Abstract

The class $\mathcal{SLUR}$ (Single Lookahead Unit Resolution) was introduced in
Schlipf, Annexstein, Franco, and Swaminathan as an umbrella class
for efficient SAT solving, with in fact linear-time SAT decision (while
the recognition problem was not considered). Čepek, Kučera, and Vlček [12]
and Balyo, Štefan Gurský, Kučera, and Vlček extended this class
in various ways to hierarchies covering all of CNF (all clause-sets). We
introduce a hierarchy $\mathcal{SLUR}_k$ which we argue is the natural "limit" of
such approaches.

The second source for our investigations is the class $\mathcal{UC}$ of
unit-refutation complete clause-sets introduced in del Val. Via the theory
of (tree-resolution based) "hardness" of clause-sets as developed in
Kullmann and in Ansótegui, Bonet, Levy, and Manyà we obtain a natural
generalisation $\mathcal{UC}_k$, containing those clause-sets which are "unit-refutation
complete of level $k$", which is the same as having hardness at most $k$.
Utilising the strong connections to (tree-)resolution complexity and (nested)
input resolution, we develop fundamental methods for the determination
of hardness (the level $k$ in $\mathcal{UC}_k$).

A fundamental insight now is that $\mathcal{SLUR}_k = \mathcal{UC}_k$ holds for all $k$. We
can thus exploit both streams of intuitions and methods for the
investigations of these hierarchies. As an application we can easily show that the
hierarchies from Čepek et al. [12] and Balyo et al. are strongly subsumed by $\mathcal{SLUR}_k$.

Finally we consider the problem of "irredundant" clause-sets in $\mathcal{UC}_k$.
For 2-CNF we show that strong minimisations are possible in polynomial
time, while already for (very special) Horn clause-sets minimisation is
NP-complete. We conclude with an extensive discussion of open problems and
future directions.

1 Introduction

The boolean satisfiability problem, SAT for short, in its core version is the 
problem of deciding satisfiability of a conjunctive normal form (clause-set) F; see 
the handbook Biere, Heule, van Maaren, and Walsh for further information. 
An important theme is the search for relevant classes C of clause-sets F
for which one can (at least) decide satisfiability in polynomial time (that is, 
deciding whether F logically implies the empty clause); see Section 1.19 in 
Franco and Martin for some basic information. For the task of knowledge 
compilation one wants more from the target-class C, namely that the clausal 
entailment problem (deciding whether F logically implies some given clause) can 
be decided in polynomial time; see Darwiche and Marquis for an overview.
In this report we now bring together two previously unconnected streams of
research from these two areas: 

SLUR The SLUR algorithm is an incomplete linear-time SAT-decision algo- 
rithm, based on look-ahead via unit-clause propagation. 

UC The class UC of unit-refutation complete clause-sets enables clausal-entail- 
ment decision in linear time via unit-clause propagation. 

In Subsections 1.1 and 1.2 we will discuss these two streams in turn, while their
unification is outlined in Subsection 1.3, and applications to "SAT knowledge
compilation" are discussed in Subsection 1.4. This is the underlying report of
the conference version Gwynne and Kullmann [28], while the journal version is
Gwynne and Kullmann [27].

1.1 The quest for SLUR hierarchies 

In the year 1995 in Schlipf et al. the SLUR algorithm was introduced, a 
simple incomplete non-deterministic SAT-decision algorithm, which always suc- 
ceeded on various classes with polynomial-time SAT decision where previously 
only rather complicated algorithms were known. The computation is divided 
into two phases for input-clause-set F: First we check via unit-clause propaga- 
tion (UCP) for unsatisfiability. If this check fails, then we assume F is satisfiable, 
and guess a satisfying assignment, using UCP-look-ahead for the guessed
assignments to avoid obviously false assignments. The class $\mathcal{SLUR}$ contains those $F$
where this algorithm always succeeds (i.e., always finds a satisfying assignment 
in the second phase). 

So recognition of $\mathcal{SLUR}$ seems a non-trivial problem, while SAT decision for
$F \in \mathcal{SLUR}$ can be done in linear time. The natural question arises, whether
$\mathcal{SLUR}$ can be turned into a hierarchy, covering in the limit all clause-sets. A
generalisation of SLUR has been considered in Franco and Schlipf under
the name "ISLUR" (improved SLUR), allowing a polynomial number $p(\ell(F))$
of backtracks (for a fixed polynomial $p$, in the input-size $\ell(F)$), in the
unsatisfiability as well as in the satisfiability phase of the SLUR algorithm, before
giving up. It is mentioned that ISLUR gives up on every large enough "sparse"
clause-set (which are "typical" as random $k$-CNF clause-sets), when no variable
occurs "too often". This was considered to be "disappointing", but from our
point of view the value of the class $\mathcal{SLUR}$ lies not in being a "big" class of
clause-sets with polynomial-time SAT solving, but in establishing a basic target
class for representations of boolean functions with very strong properties via
clause-sets; see Subsection 1.4 for further discussions. For all fixed $k$ there
exists a polynomial $p$ such that the $k$-th level of our hierarchy, $\mathcal{SLUR}_k$, is contained in
the class ISLUR (those clause-sets where the ISLUR algorithm never gives up).
So all levels are negligible when considering the above sparse clause-sets, but as
we will argue in Subsection 1.4, nevertheless this hierarchy is proper regarding
good representations of boolean functions, and the parameter $k$ is meaningful
and robust (not just a numerical parameter like the polynomial $p$).

In Čepek et al. [12], Balyo et al. the authors finally proved that
membership decision of $\mathcal{SLUR}$ is coNP-complete, and presented three hierarchies,
$\mathcal{SLUR}(k)$, $\mathcal{SLUR}*(k)$ and $\mathrm{CANON}(k)$. It still seemed that none of these
hierarchies is the final answer, though they all introduce a certain natural intuition.
We now present what seems the natural "limit hierarchy", which we call $\mathcal{SLUR}_k$,
and which unifies the two basic intuitions embodied in $\mathcal{SLUR}(k)$, $\mathcal{SLUR}*(k)$ on
the one hand and $\mathrm{CANON}(k)$ on the other hand.

In order to do so we need a precise analysis of the $\mathcal{SLUR}$-class. We
introduce the SLUR transition relation $F \xrightarrow{\mathrm{SLUR}} F'$ between clause-sets $F, F'$, which
makes precise one non-deterministic step of the SLUR-algorithm. This
transition from $F$ to $F'$ happens when assigning a (single) literal in such a way that
UCP does not create the empty clause. The core of the classes $\mathcal{SLUR}(k)$ and
$\mathcal{SLUR}*(k)$ is to strengthen the transition relation by requesting that not just
one literal is choosable, but actually $k$ literals can be chosen, while the difference
between them is that $\mathcal{SLUR}*(k)$ performs UCP in between the choices, while
the weaker class $\mathcal{SLUR}(k)$ does not.

Before we can describe our solution, the $\mathcal{SLUR}_k$-hierarchy, we need to
discuss the second source of our approach, the class $\mathcal{UC}$ of "unit-refutation complete
clause-sets", which is related to the stream embodied by $\mathrm{CANON}(k)$.



1.2 Unit-refutation completeness and "hardness" 

In the year 1994 in del Val the class $\mathcal{UC}$ was introduced, containing
clause-sets $F$ such that clausal entailment, that is, whether $F \models C$ holds (clause $C$
follows logically from $F$, i.e., $C$ is an implicate of $F$), can be decided by
unit-clause propagation. The motivation was knowledge compilation, that is, to have
a more succinct alternative to the use of the set of all prime implicates of a given
clause-set $F_0$ (clausal database), for which one seeks an equivalent $F$ such that
clausal entailment can be decided quickly.

A second development is important here, namely the development of the
notion of "hardness" in Kullmann and in Ansótegui et al. The first source,
from 1999, introduced the notion of hardness as a measure $\mathrm{hd}_0 : \mathcal{CLS} \to \mathbb{N}_0$,
assigning natural numbers to clause-sets in the following way (using $\mathcal{SAT} \subset \mathcal{CLS}$
for the satisfiable clause-sets, and $\mathcal{USAT} := \mathcal{CLS} \setminus \mathcal{SAT}$):

• $\mathrm{hd}_0(F) := 0$ for the simplest clause-sets $F \in \mathcal{CLS}$ regarding SAT decision,
containing the empty clause (i.e., $\bot \in F$) or being empty (i.e., $F = \top$).

• $\mathrm{hd}_0(F) = k \ge 1$ iff there is a literal $x$ such that for $F' := \langle x \to 0 \rangle * F$
(setting $x$ to 0) we have $\mathrm{hd}_0(F') \le k - 1$ and either $F' \in \mathcal{USAT}$ and
$\mathrm{hd}_0(\langle x \to 1 \rangle * F) \le k$, or $F' \in \mathcal{SAT}$.

The second source, from 2004, generalised this approach to constraint
satisfaction problems (and beyond). The third source, Ansótegui et al. from 2008, considered
$\mathrm{hd}_0(F)$ on unsatisfiable clause-sets $F \in \mathcal{USAT}$, relating it to backdoors,
cycle-cutsets and treewidth, and performing an experimental study on random
instances. Also in Ansótegui et al. we find a different extension of $\mathrm{hd}_0 : \mathcal{USAT} \to \mathbb{N}_0$ to a
measure $\mathrm{hd} : \mathcal{CLS} \to \mathbb{N}_0$, using for satisfiable instances $F \in \mathcal{SAT}$ the
maximisation over all unsatisfiable sub-instances obtained by applying partial
assignments. This hardness notion is harder to measure: as we show in this report,
determining whether $\mathrm{hd}(F) \le k$ holds for a fixed $k \ge 1$ is coNP-complete, while
$\mathrm{hd}_0(F) \le k$ can be decided in polynomial time (for fixed $k$). Nevertheless it is
the central measure for this report, and we consider it as measuring
"representation hardness", while $\mathrm{hd}_0$ measures "solver hardness".



As we show in Theorem 5.7, $\mathrm{hd}(F) \le k$ is equivalent to the property of
$F$ that all implicates of $F$ (i.e., all clauses $C$ with $F \models C$) can be derived by
$k$-times nested input resolution from $F$, a generalisation of input resolution as
introduced and studied in Kullmann. So we obtain that $\mathcal{UC}$ is precisely the class of
clause-sets $F$ with $\mathrm{hd}(F) \le 1$! It is then natural to define the hierarchy $\mathcal{UC}_k$ via
the property $\mathrm{hd}(F) \le k$. The hierarchy $\mathrm{CANON}(k)$ is based on resolution trees
of height at most $k$, which is a special case of $k$-times nested input resolution,
and so we have $\mathrm{CANON}(k) \subset \mathcal{UC}_k$.



1) Actually a two-dimensional family $\mathrm{hd}_{\mathcal{U},\mathcal{S}}$ of such measures was introduced, based on
oracles $\mathcal{U} \subseteq \mathcal{USAT}$, $\mathcal{S} \subseteq \mathcal{SAT}$ for deciding unsatisfiability resp. satisfiability, and setting
$\mathrm{hd}_{\mathcal{U},\mathcal{S}}(F) := 0$ for $F \in \mathcal{U} \cup \mathcal{S}$. In this report we consider only the simplest base case
$\mathrm{hd}_0 = \mathrm{hd}_{\mathcal{U}_0,\mathcal{S}_0}$, where $\mathcal{U}_0 := \{F \in \mathcal{CLS} : \bot \in F\}$ and $\mathcal{S}_0 := \{\top\}$. Oracle $\mathcal{S}$ does not play a role in
the setting of this report, which is fully unsatisfiability-based. See Subsection 6.3 for more
information on these hierarchies, and see the outlook in a later subsection on relativised hardness.

2) $\mathrm{hd}(F)$ actually captures tree-like resolution (in a sense). In Subsection 9.5 we discuss a
width-based measure of hardness, which captures dag-like resolution. We consider the
tree-hardness as the natural starting point.

3) Equivalently, as shown in Kullmann, one can say that all implicates $C$ have a tree-resolution
proof using space at most $k + 1$.

1.3 Bringing SLUR and UC together

In order to get back to SLUR, we need to emphasise the two-sided nature of
the hardness measure, as developed in Kullmann. In Subsection 1.2 we discussed
the proof-theoretic side of it. The algorithmic side is given by the reductions
$r_k : \mathcal{CLS} \to \mathcal{CLS}$ (introduced in Kullmann), which perform certain forced assignments:

1. $r_1$ is UCP, assigning $x \to 1$ for unit-clauses $\{x\}$ until all are eliminated.

2. $r_2$ is (complete) failed-literal elimination, assigning, while possible, $x \to 1$
for literals $x$ such that the assignment $x \to 0$ yields a contradiction via
$r_1$; see Section 5.2.1 in Heule and van Maaren for the usage of failed
literals in SAT solvers (so-called "look-ahead solvers"), and see Section
7.2.2 in Kullmann for the general explanation of $r_2$ being the
"look-ahead version" of $r_1$.

3. In general $r_{k+1}$ is the "look-ahead version" of $r_k$, assigning, while possible,
$x \to 1$ for literals $x$ such that the assignment $x \to 0$ yields a contradiction
via $r_k$.

For unsatisfiable $F$ the hardness $\mathrm{hd}(F)$ is equal to the minimal $k$ such that
$r_k(F)$ detects unsatisfiability of $F$, i.e., $r_k(F) = \{\bot\}$. This yields the basic
observation $\mathcal{UC} \subseteq \mathcal{SLUR}$, and actually we have $\mathcal{UC} = \mathcal{SLUR}$!
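
The reductions $r_k$ and the characterisation of hardness for unsatisfiable clause-sets can be illustrated by a small brute-force sketch. It is not the authors' implementation: the encoding of literals as non-zero integers, the helper names `assign`, `r`, `hardness_unsat` and `cls`, and the base case $r_0$ (which only recognises the empty clause) are assumptions made for illustration, and the running time is exponential in $k$.

```python
from typing import FrozenSet

Clause = FrozenSet[int]            # literals are non-zero ints, -v negates v
ClauseSet = FrozenSet[Clause]

BOTTOM: ClauseSet = frozenset({frozenset()})   # {⊥}: contradiction found


def cls(*clauses) -> ClauseSet:
    """Hypothetical helper: build a clause-set from lists of literals."""
    return frozenset(frozenset(c) for c in clauses)


def assign(F: ClauseSet, x: int) -> ClauseSet:
    """<x -> 1> * F: drop clauses containing x, remove -x from the rest."""
    return frozenset(C - {-x} for C in F if x not in C)


def r(k: int, F: ClauseSet) -> ClauseSet:
    """Generalised unit-clause propagation r_k (assumed base case: r_0)."""
    if frozenset() in F:
        return BOTTOM
    if k == 0:
        return F
    while True:
        lits = sorted({y for C in F for x in C for y in (x, -x)})
        # x is forced to 1 if setting x to 0 is refuted by r_{k-1}
        forced = next(
            (x for x in lits if r(k - 1, assign(F, -x)) == BOTTOM), None
        )
        if forced is None:
            return F
        F = assign(F, forced)
        if frozenset() in F:
            return BOTTOM


def hardness_unsat(F: ClauseSet) -> int:
    """Minimal k with r_k(F) = {⊥}; raises if F is satisfiable."""
    n = len({abs(x) for C in F for x in C})
    for k in range(n + 1):
        if r(k, F) == BOTTOM:
            return k
    raise ValueError("clause-set is satisfiable")


# All four binary clauses over two variables: UCP is stuck, r_2 refutes.
F2 = cls([1, 2], [1, -2], [-1, 2], [-1, -2])
print(r(1, F2) == F2, hardness_unsat(F2))
```

For the clause-set `F2` above, unit-clause propagation changes nothing, while one level of look-ahead refutes it, so its hardness is 2.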

So by replacing the use of $r_1$ in the SLUR algorithm by $r_k$ (using our analysis
via the transition relation) we obtain a natural hierarchy $\mathcal{SLUR}_k$, which includes
the previous SLUR-hierarchies $\mathcal{SLUR}(k)$ and $\mathcal{SLUR}*(k)$, and where we have
$\mathcal{SLUR}_k = \mathcal{UC}_k$. This equality of these two hierarchies is our argument that we
have found the "limit hierarchy" for SLUR.

1.4 Outlook on good representations of boolean functions 



The ideas presented in Subsections 1.1 to 1.3 are the main thrust
for the results of this paper (Sections 3 to 7), while in the final Section 8 (and
also in the outlook in Section 9) we touch upon what we consider as the main
application area and the main area for future developments of the theory, namely
a theory of good representations of boolean functions. More precisely, in Section
8 we consider the complexity of finding short equivalent clause-sets of bounded
hardness for the most basic CNF classes, 2-CNF and Horn clause-sets, and
we show feasibility for the former, NP-completeness for the latter. We roughly
outline now the basic ideas on "good representations" in general, while in Section
9 some more details are presented.

SAT algorithms have seen an astounding development in the last two decades. 
Especially efficient algorithms, data structures and heuristics have been devel- 
oped. The main bottleneck currently is that the underlying constraint problem 
needs to be represented via boolean CNF, and it is not clear at all how to do 
this so that SAT solving becomes as easy as possible. "SAT modulo Theories" 
(SMT; see Barrett, Sebastiani, Seshia, and Tinelli) boosts the representation
by extending the general method, however it does not yield insights into how to
construct the basic representations by CNFs. What is needed is a systematic
investigation into "good representations" of boolean functions $f$ by clause-sets
$F$, with the aim of "intelligent" SAT translations.

As a first answer, we consider the classes $\mathcal{UC}_k$ as the most basic target classes,
that is, $F \in \mathcal{UC}_k$ for $k$ "as small as possible" is the (basic) fundamental guideline.
The motivation for $\mathcal{UC}$ was that of a "good representation", while the motivation
for $\mathcal{SLUR}$ was "good SAT solving"; the hierarchies $\mathcal{UC}_k = \mathcal{SLUR}_k$ bring
these two aspects together, and this in a parameterised way, so that $k$ can be
traded against the size of $F$. So the theory of good representations $F$ of boolean
functions $f$ can be considered as "SAT knowledge representation", where the
"knowledge", the boolean function $f$, must be represented by a clause-set $F$ such
that all "aspects" of $f$ (most fundamentally the prime implicates) are represented
in such a way that a SAT solver can "understand" this representation.

What is now the precise relation between the boolean function $f$ to be
represented, and the representation $F$, a clause-set? The most basic idea is to
consider that $F$ as a CNF is equivalent to $f$, which we write as $F = f$ (more
precisely, $\mathrm{CNF}(F) = f$). Good representations in this (restricted) setting then
amount to considering subsets $F \subseteq \mathrm{prc}_0(f)$ of the set of prime implicates of $f$,
such that $F = f$ and such that $\mathrm{hd}(F)$ and $\ell(F)$ (the size of $F$) are in a
"reasonable" relationship (the lower $\mathrm{hd}(F)$ the higher $\ell(F)$, and so a balance is to be
sought). The basic conjecture then states that allowing larger hardness yields
more possibilities for short representations:

Conjecture 1.1 For every $k \in \mathbb{N}_0$ there exists a sequence $(f_n)_{n \in \mathbb{N}}$ of boolean
functions, such that no polysize sequence $(F_n)_{n \in \mathbb{N}}$ (i.e., where $(\ell(F_n))_{n \in \mathbb{N}}$ is
polynomially bounded in $n$) exists with

• $F_n = f_n$

• $\mathrm{hd}(F_n) \le k$

for all $n$, but where such a sequence $(F_n)_{n \in \mathbb{N}}$ exists when allowing $\mathrm{hd}(F_n) \le k+1$.



Conjecture 9.4 extends this conjecture to include the use of new variables, and
also refines it by introducing intermediate levels between the hardness-levels.
The algorithmic approach for such representations (not using new variables)
is to systematically search for small $F$ with a given hardness upper-bound. In
Section 8 one finds the most basic considerations. In Gwynne and Kullmann
we presented some initial experimental results on using this approach for
the (small) building-blocks like the S-boxes in block ciphers like AES and DES,
for their SAT-based cryptanalysis (see Section 9 for more information).

4) In Gwynne and Kullmann [29] we have meanwhile established that Conjecture 1.1
is true.

1.5 The Schaefer classes

We conclude by some remarks on the four main classes from Schaefer's
dichotomy result (see Section 12.2 in Dantsin and Hirsch for an introduction,
and see Creignou, Kolaitis, and Vollmer for an in-depth overview on recent
developments). Our point of view here is that we consider a boolean function
$f$ which is either Horn, dual Horn, bijunctive or affine, and we ask for a good
representation $F \in \mathcal{CLS}$ of $f$:

• If $f$ is Horn or dual Horn, then there is a (dual) Horn clause-set $F$
equivalent to $f$, and by Lemma 6.5 we have $\mathrm{hd}(F) \le 1$. So obtaining
a representation $F \in \mathcal{UC}$ is trivial; however optimising the size of $F$ is
NP-complete (see Theorem 8.4).



• If $f$ is bijunctive, then there is a 2-CNF $F$ equivalent to $f$, and, as shown
later in this report, we have $\mathrm{hd}(F) \le 2$. Moreover, by Theorem 8.3 we can
reduce the hardness to 0 or 1 (as we wish) in polynomial time, and that
by optimal (shortest) such $F$.

• If $f$ is affine, that is, $f$ is the conjunction of $m$ linear equations
$x_1 \oplus \dots \oplus x_p = 0$ over $\{0, 1\}$ viewed as a 2-element field, with addition $\oplus$ as
exclusive-or, then the situation regarding the existence of a representation
of bounded hardness is not fully understood yet:

1. If $m = 1$, then there is precisely one CNF-representation of $f$ without
new variables, containing $2^{p-1}$ clauses and being (trivially) of
hardness 0. So without new variables we have a polysize representation
of bounded hardness iff $p$ is bounded.

2. When allowing new variables, then for $m = 1$ there is a
representation $F \in \mathcal{UC}$, as will be shown in Gwynne and Kullmann.

3. For arbitrary $m$ there is definitely no small representation without
new variables when the clause-length $p$ is unbounded. When
bounding $p$, or when allowing new variables, then the existence of a polysize
$F \in \mathcal{UC}_k$ for some fixed $k$ seems to be an interesting open problem;
for some partial results see Laitinen, Junttila, and Niemelä. Perhaps
no polysize representations $F \in \mathcal{UC}$ exist, even for the "relative
condition", where propagation-conditions are posed only for the
variables in the XOR-clauses; see Bessiere, Katsirelos, Narodytska, and
Walsh for general tools for such lower bounds, and see Subsections
9.1, 9.4 for more discussions.
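
The case $m = 1$ without new variables can be made concrete by a short sketch that enumerates the full clauses of a single XOR-constraint. The function name `xor_cnf`, the numbering of variables as $1, \dots, p$, and the parameter `b` for the right-hand side of the equation are assumptions for illustration only.

```python
from itertools import product
from typing import FrozenSet

Clause = FrozenSet[int]            # literals are non-zero ints, -v negates v
ClauseSet = FrozenSet[Clause]


def xor_cnf(p: int, b: int = 0) -> ClauseSet:
    """Full clauses over variables 1..p equivalent to x_1 xor ... xor x_p = b."""
    clauses = set()
    for bits in product((0, 1), repeat=p):
        if sum(bits) % 2 != b:
            # bits is a falsifying total assignment; the full clause that is
            # falsified exactly by it contains v if bits[v] = 0, else -v
            clauses.add(
                frozenset(-(i + 1) if bit else (i + 1) for i, bit in enumerate(bits))
            )
    return frozenset(clauses)


def satisfies(bits, F: ClauseSet) -> bool:
    """Check whether the total assignment bits (0/1 per variable) satisfies F."""
    return all(
        any((x > 0) == (bits[abs(x) - 1] == 1) for x in C) for C in F
    )


for p in range(1, 6):
    print(p, len(xor_cnf(p)))      # number of clauses doubles with each variable
```

Each falsifying total assignment contributes one full clause, which is why the number of clauses is $2^{p-1}$ and grows exponentially in the clause-length $p$.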



1.6 Overview 

After discussing basic terminology in Section 2, in Section 3 we discuss SLUR
and existing extensions. We give a precise (mathematical) definition of the class
$\mathcal{SLUR}$, achieving a conceptually clear understanding, and based on these
concepts we give precise (mathematical) definitions of the various SLUR hierarchies
from the literature. In Section 4 we provide the background about generalised
unit-clause propagation, that is, the reductions $r_k : \mathcal{CLS} \to \mathcal{CLS}$, where $\mathcal{CLS}$
is the set of all clause-sets and $r_1$ is unit-clause propagation. Section 5 then
introduces the hardness $\mathrm{hd} : \mathcal{CLS} \to \mathbb{N}_0$ and defines the classes $\mathcal{UC}_k \subset \mathcal{CLS}$
of "unit-refutation complete clause-sets of level $k$" as those $F$ with $\mathrm{hd}(F) \le k$.
The first main result is Theorem 5.7, which states that the elements of $\mathcal{UC}_k$
are precisely the clause-sets $F$ where every prime implicate of $F$ can be derived
by $k$-times nested input resolution from $F$. In Section 6 we develop various tools
to determine hardness. First we consider various constructions in Subsection
6.1. Then in Subsection 6.2 we provide tools to show that classes of clause-sets
have bounded hardness, with applications to common classes and to stability
properties of the classes $\mathcal{UC}_k$. Alternative and generalised hardness-notions are
considered in Subsection 6.3. We conclude by considering algorithmic ways to
determine the hardness-measure in Subsection 6.4. Section 7 introduces the
$\mathcal{SLUR}_k$ hierarchy. Our second major result is Theorem 7.4, showing that
$\mathcal{UC}_k = \mathcal{SLUR}_k$ holds. From this characterisation we derive in Theorem 7.5
the coNP-completeness of membership decision for $\mathcal{UC}_k$ when $k \ge 1$. And in
Theorems 7.6, 7.7 we show that the previous hierarchies are (strictly) included
in the $\mathcal{SLUR}_k$ hierarchy, which we consider as a kind of "completion", where
both approaches, based on SLUR and UC, meet. In Section 8 we turn towards
the problem of finding short equivalent clause-sets of low hardness for a given
clause-set $F$. In Theorem 8.3 we show that for $F$ in 2-CNF we can compute
optimal equivalent clause-sets (of low hardness) in polynomial time. While in
Theorem 8.4 we show that already for Horn clause-sets $F$, even when all prime
implicates are given as part of the input, the decision whether there is an
equivalent clause-set (of low hardness) using at most a given number of clauses is
NP-complete. We conclude in Section 9 with the summary and an extensive
discussion of future directions.



2 Preliminaries 

We follow the general notions and notations as outlined in Kleine Büning and
Kullmann. We use $\mathbb{N} = \{1, \dots\}$ and $\mathbb{N}_0 = \mathbb{N} \cup \{0\}$. Based on an infinite
set $\mathcal{VA}$ of variables, we form the set $\mathcal{LIT} := \mathcal{VA} \cup \overline{\mathcal{VA}}$ of positive and negative
literals, using complementation. A clause $C \subset \mathcal{LIT}$ is a finite set of literals
without clashes, i.e., $C \cap \overline{C} = \emptyset$, where for $L \subseteq \mathcal{LIT}$ we set $\overline{L} := \{\overline{x} : x \in L\}$.
The set of all clauses is denoted by $\mathcal{CL}$. A clause-set $F \subset \mathcal{CL}$ is a finite set of
clauses, and the set of all clause-sets is denoted by $\mathcal{CLS}$. For $k \in \mathbb{N}_0$ we use
$k\text{-}\mathcal{CLS} := \{F \in \mathcal{CLS} \mid \forall C \in F : |C| \le k\}$ for the set of clause-sets where all
clauses have length at most $k$.

A special clause is the empty clause $\bot := \emptyset \in \mathcal{CL}$, and a special clause-set is
the empty clause-set $\top := \emptyset \in \mathcal{CLS}$. By $\mathrm{lit}(F) := \bigcup F \cup \overline{\bigcup F}$ we denote the set
of literals occurring in at least one polarity in $F$.

We use $\mathrm{var} : \mathcal{LIT} \to \mathcal{VA}$ for the underlying variable of a literal,
$\mathrm{var}(C) := \{\mathrm{var}(x) : x \in C\} \subset \mathcal{VA}$ for the set of variables in a clause, and
$\mathrm{var}(F) := \bigcup_{C \in F} \mathrm{var}(C)$ for the set of variables in a clause-set. So $\mathrm{lit}(F) = \mathrm{var}(F) \cup \overline{\mathrm{var}(F)}$.
The number of variables in a clause-set is $n(F) := |\mathrm{var}(F)| \in \mathbb{N}_0$, the number
of clauses is $c(F) := |F| \in \mathbb{N}_0$, and the number of literal occurrences is
$\ell(F) := \sum_{C \in F} |C| \in \mathbb{N}_0$.
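
These basic notions translate directly into set operations. The following sketch assumes an encoding that is not part of the report: variables are positive integers, a literal is a non-zero integer (negation is the sign), a clause is a `frozenset` of literals, and a clause-set is a `frozenset` of clauses. The function names mirror the mathematical notation, with `ell` standing for $\ell$.

```python
from typing import FrozenSet, Set

Clause = FrozenSet[int]            # literals are non-zero ints, -v negates v
ClauseSet = FrozenSet[Clause]


def cls(*clauses) -> ClauseSet:
    """Hypothetical helper: build a clause-set from lists of literals."""
    return frozenset(frozenset(c) for c in clauses)


def var(F: ClauseSet) -> Set[int]:
    """var(F): the variables occurring in F."""
    return {abs(x) for C in F for x in C}


def lit(F: ClauseSet) -> Set[int]:
    """lit(F): both polarities of every variable occurring in F."""
    return {y for v in var(F) for y in (v, -v)}


def n(F: ClauseSet) -> int:
    """n(F): number of variables."""
    return len(var(F))


def c(F: ClauseSet) -> int:
    """c(F): number of clauses."""
    return len(F)


def ell(F: ClauseSet) -> int:
    """l(F): number of literal occurrences."""
    return sum(len(C) for C in F)


F = cls([1, -2], [2, 3], [-1])
print(n(F), c(F), ell(F))
```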

A full clause-set is a clause-set $F$ such that each clause contains all variables,
that is, for all $C \in F$ we have $\mathrm{var}(C) = \mathrm{var}(F)$. The set of Horn clause-sets
is $\mathcal{HO} \subset \mathcal{CLS}$, where every clause contains at most one positive literal, while
the pure Horn clause-sets form the subset of $\mathcal{HO}$ where every clause contains
exactly one positive literal. $\mathcal{HO} \subset \mathcal{RHO} \subset \mathcal{CLS}$ is the set of renamable ("hidden")
Horn clause-sets, which by flipping signs can be turned into a Horn clause-set.

A partial assignment $\varphi : V \to \{0,1\}$ maps a finite $V \subset \mathcal{VA}$ to truth-values;
the set of all partial assignments is $\mathcal{PASS}$. A special partial assignment is the
empty partial assignment $\langle\rangle := \emptyset \in \mathcal{PASS}$. We can construct partial assignments
via $\langle v_1 \to \varepsilon_1, \dots, v_n \to \varepsilon_n \rangle \in \mathcal{PASS}$ for $v_i \in \mathcal{VA}$ and $\varepsilon_i \in \{0, 1\}$ (which must be
consistent). We use $\mathrm{var}(\varphi) := V = \mathrm{dom}(\varphi)$ for the variables in the domain of
$\varphi$, and by $\mathcal{TASS}(V)$ we denote the set of all "total assignments" for $V$, that is,
the $\varphi \in \mathcal{PASS}$ with $\mathrm{var}(\varphi) = V$. And $n(\varphi) := |\mathrm{var}(\varphi)| \in \mathbb{N}_0$ is the number of
variables assigned by $\varphi$.

For a partial assignment $\varphi \in \mathcal{PASS}$ and a clause-set $F \in \mathcal{CLS}$ the application
of $\varphi$ to $F$ is denoted by $\varphi * F \in \mathcal{CLS}$, which results from $F$ by removing
all satisfied clauses (containing at least one satisfied literal), and removing all
falsified literals from the remaining clauses. A class $\mathcal{C} \subseteq \mathcal{CLS}$ of clause-sets is
stable under (application of) partial assignments if for all $F \in \mathcal{C}$ and $\varphi \in \mathcal{PASS}$
we have $\varphi * F \in \mathcal{C}$.

A clause-set $F$ is satisfiable (i.e., $F \in \mathcal{SAT} \subset \mathcal{CLS}$) if there exists a partial
assignment $\varphi$ with $\varphi * F = \top$, otherwise $F$ is unsatisfiable
(i.e., $F \in \mathcal{USAT} := \mathcal{CLS} \setminus \mathcal{SAT}$). For a clause $C$ the partial assignment $\varphi_C \in \mathcal{PASS}$ is defined as
$\varphi_C := \langle x \to 0 : x \in C \rangle$, that is, it sets precisely the literals of $C$ to 0 (and leaves
all other variables unassigned). For example $\varphi_\bot = \langle\rangle$ and $\varphi_{\{x\}} = \langle x \to 0 \rangle$.
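
The application $\varphi * F$, the satisfiability test via $\varphi * F = \top$, and the assignment $\varphi_C$ can be sketched as follows. The encoding (literals as non-zero integers, partial assignments as dictionaries from variables to 0/1) and the names `apply`, `is_satisfiable` and `phi_C` are assumptions for illustration; the satisfiability test is a plain enumeration of all total assignments and only meant for tiny inputs.

```python
from itertools import product
from typing import Dict, FrozenSet, Optional

Clause = FrozenSet[int]            # literals are non-zero ints, -v negates v
ClauseSet = FrozenSet[Clause]
Assignment = Dict[int, int]        # variable -> 0 or 1


def cls(*clauses) -> ClauseSet:
    """Hypothetical helper: build a clause-set from lists of literals."""
    return frozenset(frozenset(c) for c in clauses)


def apply(phi: Assignment, F: ClauseSet) -> ClauseSet:
    """phi * F: remove satisfied clauses, then remove falsified literals."""
    def value(x: int) -> Optional[int]:
        v = phi.get(abs(x))
        return None if v is None else (v if x > 0 else 1 - v)

    return frozenset(
        frozenset(x for x in C if value(x) is None)
        for C in F
        if not any(value(x) == 1 for x in C)
    )


def is_satisfiable(F: ClauseSet) -> bool:
    """F is satisfiable iff some assignment phi yields phi * F = T (empty set)."""
    variables = sorted({abs(x) for C in F for x in C})
    return any(
        apply(dict(zip(variables, bits)), F) == frozenset()
        for bits in product((0, 1), repeat=len(variables))
    )


def phi_C(C: Clause) -> Assignment:
    """phi_C: set precisely the literals of C to 0."""
    return {abs(x): (0 if x > 0 else 1) for x in C}


F = cls([1, 2], [-1, 3])
print(apply({1: 1}, F) == cls([3]), is_satisfiable(F))
```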

Two clauses $C, D \in \mathcal{CL}$ are resolvable if they clash in exactly one literal
$x$, that is, $C \cap \overline{D} = \{x\}$, in which case their resolvent is $(C \cup D) \setminus \{x, \overline{x}\}$ (with
resolution literal $x$). A resolution tree is a binary tree formed by the resolution
operation. We write $T : F \vdash C$ if $T$ is a resolution tree with axioms (the clauses
at the leaves) all in $F$ and with derived clause (at the root) $C$. By $\mathrm{Comp}^*_{\mathrm{R}}(F)$
for unsatisfiable $F$ the minimum number of leaves in a tree-resolution-refutation
$T : F \vdash \bot$ is denoted.
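
As a small illustration of the resolution operation, the following sketch computes the resolvent of two clauses when they clash in exactly one literal. The integer encoding of literals and the name `resolvent` are assumptions; returning `None` for non-resolvable pairs is a design choice of this sketch.

```python
from typing import FrozenSet, Optional

Clause = FrozenSet[int]            # literals are non-zero ints, -v negates v


def resolvent(C: Clause, D: Clause) -> Optional[Clause]:
    """Resolvent of C and D if they clash in exactly one literal, else None."""
    clashes = [x for x in C if -x in D]
    if len(clashes) != 1:
        return None                # not resolvable (no clash, or several)
    x = clashes[0]
    return (C | D) - {x, -x}


C = frozenset({1, 2})
D = frozenset({-1, 3})
print(sorted(resolvent(C, D)))     # resolution literal is 1
```

Resolving two complementary unit-clauses yields the empty clause, which is the root of every resolution refutation.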

A boolean function $f$ is a map $f : \mathcal{TASS}(V) \to \{0, 1\}$ for some finite $V =: \mathrm{var}(f)$;
we can also use $f(\varphi) \in \{0,1\}$ for $\varphi \in \mathcal{PASS}$ with $\mathrm{var}(f) \subseteq \mathrm{var}(\varphi)$,
in which case $\varphi$ is restricted to $\mathrm{var}(f)$. Special boolean functions are $0^V$ and
$1^V$ for the constant-0 resp. constant-1 functions with domain $V$. We write
$f \models g$ for boolean functions $f, g$ if for all partial assignments $\varphi$ with
$\mathrm{var}(\varphi) \supseteq \mathrm{var}(f) \cup \mathrm{var}(g)$ we have $f(\varphi) = 1 \Rightarrow g(\varphi) = 1$. Equivalence of boolean functions
$f, g$ means $f \models g$ and $g \models f$ (so all $0^V$ are equivalent, and all $1^V$ are equivalent).

The interpretation of clauses C and clause-sets F as boolean functions is 
explicitly denoted by CNF(C) and CNF(F), using the CNF- interpretation (a 
clause as a disjunction of literals, a clause-set as a conjunction of clauses), and 
happens in this report typically implicitly. 

For a boolean function $f$ the set of prime implicates is denoted by $\mathrm{prc}_0(f)$,
the set of all clauses $C$ with $f \models C$, while for $C' \subset C$ we have $f \not\models C'$. (The "0" in
$\mathrm{prc}_0(f)$ resp. $\mathrm{prc}_0(F)$ for the set of prime implicates of a boolean function or a
clause-set (interpreted as CNF) shall remind of "false" or "unsatisfiable", since
CNFs have "falsity" at the core.) So a boolean function $f$ is equivalent to $\mathrm{prc}_0(f)$,
that is, more explicitly, to $\mathrm{CNF}(\mathrm{prc}_0(f))$. As is well known, by considering
any clause-set $F$ equivalent to $f$ and computing the resolution-closure of $F$,
followed by subsumption-elimination, we obtain precisely $\mathrm{prc}_0(f)$.
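
The computation of prime implicates via resolution-closure followed by subsumption-elimination can be sketched naively as follows. This is an illustration under assumed names (`resolvent`, `prime_implicates`, `cls`) and the integer encoding of literals, not an efficient algorithm: the closure can be exponentially large.

```python
from typing import FrozenSet, Optional

Clause = FrozenSet[int]            # literals are non-zero ints, -v negates v
ClauseSet = FrozenSet[Clause]


def cls(*clauses) -> ClauseSet:
    """Hypothetical helper: build a clause-set from lists of literals."""
    return frozenset(frozenset(c) for c in clauses)


def resolvent(C: Clause, D: Clause) -> Optional[Clause]:
    """Resolvent of C and D if they clash in exactly one literal, else None."""
    clashes = [x for x in C if -x in D]
    if len(clashes) != 1:
        return None
    x = clashes[0]
    return (C | D) - {x, -x}


def prime_implicates(F: ClauseSet) -> ClauseSet:
    """Resolution closure of F, then removal of all subsumed clauses."""
    closure = set(F)
    changed = True
    while changed:
        changed = False
        for C in list(closure):
            for D in list(closure):
                R = resolvent(C, D)
                if R is not None and R not in closure:
                    closure.add(R)
                    changed = True
    # keep only clauses without a proper sub-clause in the closure
    return frozenset(C for C in closure if not any(D < C for D in closure))


print(prime_implicates(cls([1, 2], [-1, 2])) == cls([2]))
```

For an unsatisfiable input the closure contains the empty clause, which subsumes everything else, so the result is the clause-set containing only the empty clause.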

We denote by $\mathrm{CNF}(f)$ the "distinguished canonical normal form", or the set
of "minterms of $f$", that is, the set of clauses $C \in \mathcal{CL}$ with $\mathrm{var}(C) = \mathrm{var}(f)$
and $f \models C$ (that is, $f \models \mathrm{CNF}(C)$). Dually, by $\mathrm{DNF}(f)$ we denote the set of
clauses $C \in \mathcal{CL}$ with $\mathrm{var}(C) = \mathrm{var}(f)$ and $\mathrm{DNF}(C) \models f$ (the "maxterms of $f$";
note that for us a clause is a combinatorial object, and the logical interpretation
has to be added). In the DNF-interpretation a clause is the conjunction of its
literals, and a clause-set is the disjunction of its clauses.

Finally, by $r_1 : \mathcal{CLS} \to \mathcal{CLS}$ unit-clause propagation is denoted, that is
applying $F \mapsto \langle x \to 1 \rangle * F$ as long as there are unit-clauses $\{x\} \in F$, and
reducing $F \mapsto \{\bot\}$ in case of $\bot \in F$. In Definition 4.3 the general
$r_k : \mathcal{CLS} \to \mathcal{CLS}$ is defined.
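
A minimal sketch of unit-clause propagation $r_1$ along these lines, assuming literals are encoded as non-zero integers and using the hypothetical names `ucp`, `assign` and `cls`, could look as follows. It is a simple quadratic-time version, not the linear-time implementation used in solvers.

```python
from typing import FrozenSet

Clause = FrozenSet[int]            # literals are non-zero ints, -v negates v
ClauseSet = FrozenSet[Clause]

TOP: ClauseSet = frozenset()                   # empty clause-set T
BOTTOM: ClauseSet = frozenset({frozenset()})   # {⊥}


def cls(*clauses) -> ClauseSet:
    """Hypothetical helper: build a clause-set from lists of literals."""
    return frozenset(frozenset(c) for c in clauses)


def assign(F: ClauseSet, x: int) -> ClauseSet:
    """<x -> 1> * F: drop clauses containing x, remove -x from the rest."""
    return frozenset(C - {-x} for C in F if x not in C)


def ucp(F: ClauseSet) -> ClauseSet:
    """r_1: propagate unit-clauses; return {⊥} once the empty clause appears."""
    while True:
        if frozenset() in F:
            return BOTTOM
        unit = next((C for C in F if len(C) == 1), None)
        if unit is None:
            return F               # no unit-clause left
        (x,) = unit
        F = assign(F, x)


print(ucp(cls([1], [-1, 2], [-2, 3])) == TOP)
print(ucp(cls([1], [-1])) == BOTTOM)
```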



3 The SLUR class and extensions 

The SLUR-algorithm and the class $\mathcal{SLUR} \subset \mathcal{CLS}$ have been introduced in
Schlipf et al. The SLUR-algorithm for input $F \in \mathcal{CLS}$ is an incomplete
polynomial-time SAT algorithm, which either returns "SAT", "UNSAT" (in
both cases correctly) or gives up. This algorithm is non-deterministic, and
$\mathcal{SLUR}$ is the class of clause-sets where it never gives up (and thus SAT-decision
for $F \in \mathcal{SLUR}$ can be done in polynomial time). Due to an observation
attributed to Truemper in Franco, the SLUR-algorithm can be implemented
such that it runs in linear time. Decision of membership, that is whether
$F \in \mathcal{SLUR}$ holds, by definition is in coNP, but only in Čepek et al. [12] it
was finally shown that this decision problem is coNP-complete.

The original motivation was that $\mathcal{SLUR}$ contains several other classes,
including renamable Horn, extended Horn, hidden extended Horn, simple
extended Horn and CC-balanced clause-sets, where for each class it was known
that the SAT problem is solvable in polynomial time, but with in some cases
rather complicated proofs, while it is trivial to see that the SLUR-algorithm
runs in polynomial time. In Franco, and in Franco and Gelder, probabilistic
properties of $\mathcal{SLUR}$ have been investigated.

5) At this point a popular misunderstanding should be avoided: The well-known dichotomy
result of Schaefer (see Subsection 1.5) states that under certain conditions there are precisely
six classes of problem instances with polytime SAT solving (unless P=NP). However this
has no bearing on the classes considered here, since they do not fall within the restricted
framework of Schaefer's theorem.

In this section we first give a semantic definition of $\mathcal{SLUR}$ in Subsection 3.1.
In a nutshell, $\mathcal{SLUR}$ is the class of clause-sets where either UCP (unit-clause
propagation aka $r_1$) creates the empty clause, or where otherwise iteratively
making assignments followed by UCP will always yield a satisfying assignment,
given that these transitions do not obviously create unsatisfiable results, i.e.,
do not create the empty clause. In order to understand this definition (and its
various extensions) clearly, we present a precise mathematical (non-algorithmic)
definition, based on the transition relation $F \xrightarrow{\mathrm{SLUR}} F'$ (Definition 3.3), which
represents one non-deterministic step of the SLUR algorithm: If $r_1$ on input
$F \in \mathcal{CLS}$ does not determine unsatisfiability (in which case we have $F \in \mathcal{SLUR}$),
then $F \in \mathcal{SLUR}$ iff $\top$ can be reached by this transition relation, while everything
else reachable from $F$ is not an end-point of this transition relation.

In Čepek et al. [12] and Balyo et al. recently three approaches towards
generalising $\mathcal{SLUR}$ have been considered, and we discuss them in Subsection 3.2.
Our generalisation, called $\mathcal{SLUR}_k$, which we see as the natural completion of
these approaches, will be presented in Section 7.



3.1 SLUR 

The SLUR-algorithm ("Single Lookahead Unit Resolution") from Schlipf et al.
is described for input $F \in \mathcal{CLS}$ as follows:

1. First run UCP, that is, reduce $F \mapsto r_1(F)$.

2. If now $\bot \in F$ then we determined $F$ unsatisfiable.

3. If not, then the algorithm guesses a satisfying assignment for $F$, by
repeated transitions $F \xrightarrow{\mathrm{SLUR}} F'$, where $F'$ is obtained by assigning one
variable and then performing UCP, i.e., $F' = r_1(\langle x \to 1 \rangle * F)$ for some
literal $x$.

4. The "lookahead" means that a transition with $F' = \{\bot\}$ is avoided.

5. The algorithm might find a satisfying assignment in this way, or it gets
stuck, that is, for the chosen literal both assignments $x \to 1$ and $x \to 0$
yield $\{\bot\}$, in which case it "gives up".
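
The five steps above can be sketched as a single randomised run of the algorithm. This is an illustrative sketch, not the linear-time implementation referred to in the text: the names `slur_run`, `ucp`, `assign` and `cls`, the integer encoding of literals, and the use of a random generator to model the non-deterministic choices are all assumptions.

```python
import random
from typing import FrozenSet

Clause = FrozenSet[int]            # literals are non-zero ints, -v negates v
ClauseSet = FrozenSet[Clause]

BOTTOM: ClauseSet = frozenset({frozenset()})   # {⊥}


def cls(*clauses) -> ClauseSet:
    """Hypothetical helper: build a clause-set from lists of literals."""
    return frozenset(frozenset(c) for c in clauses)


def assign(F: ClauseSet, x: int) -> ClauseSet:
    """<x -> 1> * F: drop clauses containing x, remove -x from the rest."""
    return frozenset(C - {-x} for C in F if x not in C)


def ucp(F: ClauseSet) -> ClauseSet:
    """r_1: unit-clause propagation, returning {⊥} on a contradiction."""
    while True:
        if frozenset() in F:
            return BOTTOM
        unit = next((C for C in F if len(C) == 1), None)
        if unit is None:
            return F
        (x,) = unit
        F = assign(F, x)


def slur_run(F: ClauseSet, rng=random) -> str:
    """One run of SLUR: returns 'UNSAT', 'SAT' or 'GIVE UP'."""
    F = ucp(F)                                     # steps 1 and 2
    if F == BOTTOM:
        return "UNSAT"
    while F:                                       # F is not yet T
        v = rng.choice(sorted({abs(x) for C in F for x in C}))
        # step 4: look ahead and keep only the non-failing assignments
        options = [G for G in (ucp(assign(F, v)), ucp(assign(F, -v)))
                   if G != BOTTOM]
        if not options:                            # step 5: stuck
            return "GIVE UP"
        F = rng.choice(options)                    # step 3: one transition
    return "SAT"


print(slur_run(cls([-1, 2], [-2, 3])))
```

On a Horn clause-set such as the one in the usage line the run cannot get stuck, whereas on the unsatisfiable clause-set consisting of all four binary clauses over two variables every run gives up, since UCP alone does not detect the contradiction.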

The SLUR class is defined as the class of clause-sets where this algorithm never 
gives up. The precise details are as follows. First we define the underlying 
transition relation (one non-failing transition from F to F'): 

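Definition 3.1 For clause-sets $F, F' \in \mathcal{CLS}$ the relation $F \xrightarrow{\mathrm{SLUR}} F'$ holds
if there is $x \in \mathrm{lit}(F)$ such that $F' = r_1(\langle x \to 1 \rangle * F)$ and $F' \ne \{\bot\}$. The
transitive-reflexive closure is denoted by $F \xrightarrow{\mathrm{SLUR}}_* F'$.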

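Example 3.2 Considering when we have $F \xrightarrow{\mathrm{SLUR}}_* F'$ and when not:

1. $F \xrightarrow{\mathrm{SLUR}}_* \top$ iff $F \in \mathcal{SAT}$.

2. $\{C\} \xrightarrow{\mathrm{SLUR}} \top$ precisely for all clauses $C \ne \bot$.

3. $\{\{x, y\}, \{\overline{x}, \overline{y}\}\} \xrightarrow{\mathrm{SLUR}} \top$.

4. $\{\{\overline{x}, y\}, \{\overline{y}, z\}\} \xrightarrow{\mathrm{SLUR}} \top$ (due to e.g. $r_1(\langle x \to 1 \rangle * \{\{\overline{x}, y\}, \{\overline{y}, z\}\}) = \top$).

5. $F \xrightarrow{\mathrm{SLUR}} F'$ does not hold if there is no literal to set, or if $r_1$ detects
unsatisfiability of $F'$. That is, there are no clause-sets $F, F'$ such that
any of the following hold:

(a) $\top \xrightarrow{\mathrm{SLUR}} F$.

(b) $\{\bot\} \xrightarrow{\mathrm{SLUR}} F$.

(c) $F \xrightarrow{\mathrm{SLUR}} F$.

(d) $F \xrightarrow{\mathrm{SLUR}} F'$ where $r_1(F') = \{\bot\}$.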

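Via the transition-relation $F \xrightarrow{\mathrm{SLUR}} F'$ we can now easily define the class
$\mathcal{SLUR}$, which will find a natural generalisation in Definition 7.1 to $\mathcal{SLUR}_k$ for
$k \in \mathbb{N}_0$ (where $\mathcal{SLUR} = \mathcal{SLUR}_1$):

Definition 3.3 The set of all fully reduced clause-sets reachable from $F \in \mathcal{CLS}$
is denoted by

$$\mathrm{slur}(F) := \{F' \in \mathcal{CLS} \mid F \xrightarrow{\mathrm{SLUR}}_* F' \wedge \neg \exists\, F'' \in \mathcal{CLS} : F' \xrightarrow{\mathrm{SLUR}} F''\}.$$

Finally the class of all clause-sets which are either identified by UCP to be
unsatisfiable, or where by SLUR-reduction always a satisfying assignment is
found, is denoted by $\mathcal{SLUR} := \{F \in \mathcal{CLS} : \mathrm{r}_1(F) \neq \{\bot\} \Rightarrow \mathrm{slur}(F) = \{\top\}\}$.

We could define $\xrightarrow{\mathrm{SLUR}}$ as $F \xrightarrow{\mathrm{SLUR}} \langle x \to 1 \rangle * F$ iff $\mathrm{r}_1(\langle x \to 1 \rangle * F) \neq \{\bot\}$, and
this would yield the same class $\mathcal{SLUR}$ but a different transition relation (one
would not be forced to immediately make forced assignments).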
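
For small inputs, the set slur(F) and membership in SLUR can be computed by exhaustively exploring the transition relation. The following Python sketch does this by brute force; the integer encoding of literals and all function names (`cls`, `assign`, `ucp`, `transitions`, `slur_set`, `in_slur`) are assumptions made for illustration and do not come from the paper.

```python
BOT = frozenset([frozenset()])   # {⊥}
TOP = frozenset()                # ⊤

def cls(*clauses):
    """Build a clause-set from iterables of nonzero integers (-v negates v)."""
    return frozenset(frozenset(c) for c in clauses)

def assign(F, x):
    """<x -> 1> * F."""
    return frozenset(C - {-x} for C in F if x not in C)

def ucp(F):
    """Unit-clause propagation r_1."""
    while True:
        if frozenset() in F:
            return BOT
        unit = next((C for C in F if len(C) == 1), None)
        if unit is None:
            return F
        (x,) = unit
        F = assign(F, x)

def transitions(F):
    """All F' with F -SLUR-> F' (one assignment plus UCP, never yielding {⊥})."""
    literals = {s * abs(l) for C in F for l in C for s in (1, -1)}
    return {G for G in (ucp(assign(F, x)) for x in literals) if G != BOT}

def slur_set(F):
    """slur(F): the reachable clause-sets that have no outgoing transition."""
    seen, stack, ends = {F}, [F], set()
    while stack:
        G = stack.pop()
        successors = transitions(G)
        if not successors:
            ends.add(G)
        for H in successors - seen:
            seen.add(H)
            stack.append(H)
    return ends

def in_slur(F):
    """Membership test following the definition of the class SLUR."""
    return ucp(F) == BOT or slur_set(F) == {TOP}
```

The exploration is exponential in general, so this is meant only for checking small examples by hand, not as a recognition algorithm.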

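Example 3.4 Computing $\mathrm{slur}(F)$ for clause-sets $F$:

1. $\mathrm{slur}(F) \neq \emptyset$ (in the "worst" case we have $F \in \mathrm{slur}(F)$).

2. $\mathrm{slur}(\{\bot\}) = \{\{\bot\}\}$.

3. $\mathrm{slur}(\top) = \{\top\}$.

4. $\mathrm{slur}(\{C\}) = \{\top\}$ iff $C \neq \bot$.

5. If $\mathrm{r}_1(F) = \top$ then $\mathrm{slur}(F) = \{\top\}$.

6. $\mathrm{slur}(\{\{x,y\},\{\overline{x},\overline{y}\}\}) = \{\top\}$.

7. $\mathrm{slur}(\{\{\overline{x},y\},\{\overline{y},z\}\}) = \{\top\}$.

8. For $F := \{\{x,y\},\{\overline{x},y\},\{x,\overline{y}\},\{\overline{x},\overline{y}\}\}$ we have $\mathrm{slur}(F) = \{F\}$.

9. For $F' := \{\{z,x,y\},\{z,\overline{x},y\},\{z,x,\overline{y}\},\{z,\overline{x},\overline{y}\}\}$ we have $\top, F \in \mathrm{slur}(F')$.

3.2 Previous approaches for SLUR hierarchies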

In Čepek et al. [12] and Balyo et al. three hierarchies $\mathcal{SLUR}(k)$, $\mathcal{SLUR}{*}(k)$
($k \in \mathbb{N}$) and $\mathrm{CANON}(k)$ ($k \in \mathbb{N}_0$) have been introduced. In Section 4 of [?]
it is shown that $\mathcal{SLUR}(k) \subseteq \mathcal{SLUR}{*}(k)$ for all $k \in \mathbb{N}$ and so we restrict our
attention to $\mathcal{SLUR}{*}(k)$ and $\mathrm{CANON}(k)$.

$\mathrm{CANON}(k)$ is defined to be the set of clause-sets $F$ such that every $C \in
\mathrm{prc}_0(F)$ can be derived from $F$ by a resolution tree of height at most $k$. Note
that basically by definition (using stability of resolution proofs under application
of partial assignments) we get that each $\mathrm{CANON}(k)$ is stable under application
of partial assignments and under variable-disjoint union.

The $\mathcal{SLUR}{*}(k)$ hierarchy is derived in Balyo et al. from the $\mathcal{SLUR}$ class by extending
the reduction $\mathrm{r}_1$. We provide an alternative formalisation here, in the same
manner as in Section 3.1. The main question is the transition relation $F \to
F'$. The $\mathcal{SLUR}{*}(k)$-hierarchy provides stronger and stronger witnesses that $F'$
might be satisfiable, by longer and longer assignments (making "$k$ decisions")
not yielding the empty clause:

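Definition 3.5 That partial assignment $\varphi \in \mathcal{PASS}$ makes $k$ decisions for
some $k \in \mathbb{N}_0$ w.r.t. $F \in \mathcal{CLS}$ is defined recursively as follows: For $k = 0$ this
relation holds if $\varphi * F = \mathrm{r}_1(F)$, while for $k > 0$ this relation holds if either there
is $k' < k$ such that $\varphi$ makes $k'$ decisions w.r.t. $F$ and $\varphi * F = \top$, or there exists
$x \in \mathrm{lit}(F)$ and a partial assignment $\varphi'$ making $k-1$ decisions for $\mathrm{r}_1(\langle x \to 1 \rangle * F)$,
and where $\varphi * F = \varphi' * \mathrm{r}_1(\langle x \to 1 \rangle * F)$.

Now $F \xrightarrow{\mathrm{SLUR}*k} F'$ for $k \geq 1$ by definition holds if there is a partial assignment
$\varphi$ making $k$ decisions w.r.t. $F$ with $F' = \varphi * F$, where $F' \neq \{\bot\}$. The
reflexive-transitive closure is $\xrightarrow{\mathrm{SLUR}*k}_*$.

Finally we can define the hierarchy:

$$\mathrm{slur}{*}(k)(F) := \{F' \in \mathcal{CLS} \mid F \xrightarrow{\mathrm{SLUR}*k}_* F' \wedge \neg \exists\, F'' : F' \xrightarrow{\mathrm{SLUR}*k} F''\}$$
$$\mathcal{SLUR}{*}(k) := \{F \in \mathcal{CLS} : \mathrm{slur}{*}(k)(F) = \{F\} \vee \mathrm{slur}{*}(k)(F) = \{\top\}\}.$$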

The unsatisfiable elements of $\mathcal{SLUR}{*}(k)$ are those $F \neq \top$ with $\mathrm{slur}{*}(k)(F) =
\{F\}$. By definition each $\mathcal{SLUR}{*}(k)$ is stable under application of partial
assignments, but not stable under variable-disjoint union, since the number of
decision variables is bounded by $k$ (in Lemma 6.7 we will see that our hierarchy
is stable under variable-disjoint union, which is natural since it strengthens the
$\mathrm{CANON}(k)$-hierarchy).

Example 3.6 Some examples for $\mathrm{CANON}(k)$ and $\mathcal{SLUR}{*}(k)$ ($k \in \mathbb{N}$):

1. Consider the unsatisfiable clause-set $F := \{\{x,y\},\{\overline{x},y\},\{x,\overline{y}\},\{\overline{x},\overline{y}\}\}$.

(a) $F \notin \mathcal{SLUR}$ because $F$ is unsatisfiable but $\mathrm{r}_1(F) \neq \{\bot\}$.

(b) $F \in \mathcal{SLUR}{*}(1)$ because $\mathrm{r}_1(\langle x' \to 1 \rangle * F) = \{\bot\}$ for all $x' \in \mathrm{lit}(F)$
and so $\mathrm{slur}{*}(1)(F) = \{F\}$.

(c) This establishes $\mathcal{SLUR} \subset \mathcal{SLUR}{*}(1)$.

(d) $F \in \mathrm{CANON}(2) \setminus \mathrm{CANON}(1)$ because actually all tree-resolution
refutations of $F$ are full binary trees of height 2.

2. Consider the satisfiable clause-set $F' := \{\{x_1, \dots, x_k\} \cup C \mid C \in F\}$.

(a) $F' \notin \mathcal{SLUR}{*}(k)$ because $F' \xrightarrow{\mathrm{SLUR}*k} F$ where $F$ is unsatisfiable and
thus $\neg (F \xrightarrow{\mathrm{SLUR}*k}_* \top)$, whence $\mathrm{slur}{*}(k)(F') \neq \{\top\}$.

(b) $F' \in \mathcal{SLUR}{*}(k+1)$ because we have $\mathrm{r}_1(\varphi * F') \in \{\top, \{\bot\}\}$ for
all partial assignments $\varphi$ of length $k+1$ on variables of $F'$, hence
$\mathrm{slur}{*}(k+1)(F') = \{\top\}$.

(c) $F' \in \mathrm{CANON}(2)$ because the only prime implicate is $\{x_1, \dots, x_k\}$ and
actually all its tree-resolution proofs are full binary trees of height 2.



4 Generalised unit-clause propagation 

In this section we review the approximations of forced assignments as computed
by the hierarchy of reductions $\mathrm{r}_k : \mathcal{CLS} \to \mathcal{CLS}$ from [36] for $k \in \mathbb{N}_0$. First
we introduce the semantical notion of forced literals/assignments in Subsection
4.1 together with the limit-reduction $\mathrm{r}_\infty : \mathcal{CLS} \to \mathcal{CLS}$, which eliminates all
forced assignments. In Subsection 4.2 then the $\mathrm{r}_k$-reductions themselves
(eliminating some forced assignments) are defined and basic properties discussed. In
Subsection 4.3 finally we introduce generalised (nested) input resolution and
its main parameter, the "Horton-Strahler number" of the corresponding
resolution tree, generalising the well-known refutational equivalence between unit
resolution and input resolution, and providing the proof-theoretic background.

For further discussions of these reductions, in the context of SAT decision
and in their relations to various consistency and width-related notions, see
[?] and Section 3 in [?]. It seems to us that the $\mathrm{r}_k$-reductions establish
the SAT-counterpart to consistency-notions from the constraint literature
(see Bessiere for an overview). We have the following basic distinction
between SAT and CSP: SAT has the extremely "thin" clauses, enabling the global
point of view ("no (or flat) hierarchies"), while CSP has "fat" constraints, the
"lumping together" of clauses. In the SAT world, the $\mathrm{r}_k$-reductions
approximate global consistency via approaching all assignments of $\mathrm{r}_\infty$, while in the CSP
world, consistency means making the constraints stronger and stronger
(lumping more and more clauses together), until only one constraint is left. Thus
the (stronger) consistency-notions of CSP are more related to width-restricted
resolution, while, as shown in [36], the $\mathrm{r}_k$-reductions are much weaker (each
only using linear space). Making a clause-set $F$ "consistent" in the SAT world
thus means (to us) to find a "representation" $F'$ of $F$ (see Subsection 9.2 for
some discussion on "representations"), where via $\mathrm{r}_k$ for some $k \in \mathbb{N}_0$ we can
derive "everything", which is embodied in its most elementary form in the
$\mathcal{UC}_k$-hierarchy, that is, via the condition $F' \in \mathcal{UC}_k$ (Definition 5.6).

4.1 Forced literals/assignments



Fundamental is the notion of a "forced literal" of a boolean function resp. a
clause-set^1, which are literals which must be set to true in order to satisfy the
function resp. clause-set:

Definition 4.1 A literal $x$ is forced for a boolean function $f$ if $f \models x$, and the
set of forced literals for $f$ is $\mathrm{fl}(f) \subseteq \mathcal{LIT}$. A literal is forced for a clause-set $F$
if it is forced for $\mathrm{CNF}(F)$, and we set $\mathrm{fl}(F) := \mathrm{fl}(\mathrm{CNF}(F))$.

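Every literal is forced for every constant-zero boolean function. In fact a boolean
function $f$ is constant zero iff $\mathrm{fl}(f) = \mathcal{LIT}$ iff there is a literal $x$ with
$x, \overline{x} \in \mathrm{fl}(f)$. No literal is forced for any constant-one boolean function (its
set of forced literals is $\emptyset$). We have for every boolean function $f$ that

$$\mathrm{fl}(f) = \bigcap\nolimits_{\mathcal{LIT}} \{\, \{x \in \mathcal{LIT} : \varphi(x) = 1\} \;:\; \varphi \text{ a satisfying assignment of } f \,\}$$

(the index "$\mathcal{LIT}$" in the intersection is the "universe" of the sets considered in
the intersection, which becomes the result if there are no sets to intersect, that
is, if $f$ is unsatisfiable). More directly we can read off the forced literals from
the prime clauses, namely $x$ is forced for $f$ iff $\mathrm{prc}_0(f) \cap \{\bot, \{x\}\} \neq \emptyset$.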

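Example 4.2 Here are some basic determinations of $\mathrm{fl}(F)$:

1. $\mathrm{fl}(\{\bot\}) = \mathcal{LIT}$.

2. $\mathrm{fl}(\top) = \emptyset$.

3. $\mathrm{fl}(\{\{x_1\}, \dots, \{x_n\}\}) = \{x_1, \dots, x_n\}$.

4. $\mathrm{fl}(\{\{x,y\},\{\overline{x},\overline{y}\}\}) = \emptyset$.

5. $\mathrm{fl}(\{\{x,y\},\{x,\overline{y}\}\}) = \{x\}$.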

If $x$ is a forced literal for $F$, then the forced assignment $\langle x \to 1 \rangle$ yields
the clause-set $\langle x \to 1 \rangle * F$ which is satisfiability-equivalent to $F$. We denote by
$\mathrm{r}_\infty(F) \in \mathcal{CLS}$ the result of applying all forced assignments to $F$. Note that $F$
is unsatisfiable iff $\mathrm{r}_\infty(F) = \{\bot\}$ (while $F$ is uniquely satisfiable after discarding
variables without influence iff $\mathrm{r}_\infty(F) = \top$).
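
For very small clause-sets the forced literals can be determined directly from the semantic definition, by enumerating all total assignments. The Python sketch below is such a brute-force check; the integer encoding of literals and the names `cls` and `forced_literals` are assumptions for illustration. Since the code only knows the variables occurring in F, an unsatisfiable input yields all literals over var(F) rather than the full set LIT.

```python
from itertools import product

def cls(*clauses):
    """Build a clause-set from iterables of nonzero integers (-v negates v)."""
    return frozenset(frozenset(c) for c in clauses)

def forced_literals(F):
    """Literals over var(F) that are true in every satisfying total assignment."""
    variables = sorted({abs(l) for C in F for l in C})
    models = []
    for bits in product((False, True), repeat=len(variables)):
        value = dict(zip(variables, bits))
        # A clause is satisfied if at least one of its literals is true.
        if all(any(value[abs(l)] == (l > 0) for l in C) for C in F):
            models.append(value)
    # With no models the condition below holds vacuously for every literal.
    return {s * v for v in variables for s in (1, -1)
            if all(m[v] == (s > 0) for m in models)}
```

The running time is exponential in the number of variables, which is the reason for approximating forced assignments by poly-time reductions.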

4.2 A hierarchy of reductions 

We now review the hierarchy $\mathrm{r}_k : \mathcal{CLS} \to \mathcal{CLS}$, $k \in \mathbb{N}_0$, of reductions ([36]),
which achieves approximating $\mathrm{r}_\infty$ by poly-time computable functions. The basic
idea is that unit-clause propagation in a sense computes the most direct forced
assignments (at "level $k = 1$"), and generalisations like failed-literal elimination
(level $k = 2$) find more forced assignments.

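^1 We prefer this logical (and common) terminology over "backbone literal", which is only
used in a special context.

Definition 4.3 ([36]) The maps $\mathrm{r}_k : \mathcal{CLS} \to \mathcal{CLS}$ for $k \in \mathbb{N}_0$ are defined as
follows (for $F \in \mathcal{CLS}$):

$$\mathrm{r}_0(F) := \begin{cases} \{\bot\} & \text{if } \bot \in F \\ F & \text{otherwise} \end{cases}$$

$$\mathrm{r}_{k+1}(F) := \begin{cases} \mathrm{r}_{k+1}(\langle x \to 1 \rangle * F) & \text{if } \exists\, x \in \mathrm{lit}(F) : \mathrm{r}_k(\langle x \to 0 \rangle * F) = \{\bot\} \\ F & \text{otherwise} \end{cases}$$

$\mathrm{r}_1$ is unit-clause propagation, $\mathrm{r}_2$ is (full) failed literal elimination. We call $\mathrm{r}_k$
generalised unit-clause-propagation of level $k$. In [?] one finds the
following basic observations proven (for $k \in \mathbb{N}_0$, $F \in \mathcal{CLS}$ and $\varphi \in \mathcal{PASS}$):

• The map $\mathrm{r}_k : \mathcal{CLS} \to \mathcal{CLS}$ is well-defined (does not depend on the choices).

• $\mathrm{r}_k$ applies only forced assignments (and so $\mathrm{r}_k(F)$ is satisfiability-equivalent
to $F$).

• $\mathrm{r}_k(F)$ is computable in time $O(\ell(F) \cdot n(F)^{2(k-1)})$ and linear space.

• $\mathrm{r}_k(F) = \{\bot\}$ implies $\mathrm{r}_k(\varphi * F) = \{\bot\}$.

• $\mathrm{r}_k(\varphi * \mathrm{r}_k(F)) = \mathrm{r}_k(\varphi * F)$.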

Quasi-automatisation of tree-resolution is achieved for inputs $F \in \mathcal{USAT}$ by
applying $\mathrm{r}_0(F), \mathrm{r}_1(F), \dots$ until unsatisfiability has been achieved ([?]). Also
satisfiable instances are handled in [?], however in this paper we do not consider
these algorithmical aspects.

Actually, a more general form was introduced in [?], namely $\mathrm{r}_k^{\mathcal{U}}$ for some
oracle $\mathcal{U}$ deciding unsatisfiability at level 0. We believe that this generalisation
is important for further progress (see Subsection 9.4), however in this report we
only consider the trivial oracle $\mathcal{U}_0 = \{F \in \mathcal{CLS} : \bot \in F\}$, which (only) recognises
unsatisfiability at level 0 iff the empty clause occurs. A further generalisation
to constraint-like systems (via an abstract, axiomatic approach) was achieved
in [?], however in this initial study we do only consider boolean values and
CNF-representations.

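Example 4.4 Computing some $\mathrm{r}_k(F)$ (using literals $x_1, \dots, x_n, x, y$ with
pairwise different underlying variables):

1. $\mathrm{r}_k(\{\bot\}) = \{\bot\}$ for $k \geq 0$.

2. $\mathrm{r}_k(\top) = \top$ for $k \geq 0$.

3. For $F := \{\{x_1\}, \dots, \{x_n\}\}$: $\mathrm{r}_0(F) = F$, $\mathrm{r}_k(F) = \top$ for $k \geq 1$.

4. For $F' := F \cup \{\{x,y\}\}$: $\mathrm{r}_0(F') = F'$, $\mathrm{r}_k(F') = \{\{x,y\}\}$ for $k \geq 1$ (note
that $\{\{x,y\}\}$ has no forced assignments).

5. For $F := \{\{x,y\},\{x,\overline{y}\}\}$: $\mathrm{r}_k(F) = F$ for $k \leq 1$, $\mathrm{r}_k(F) = \top$ for $k \geq 2$.

6. For $F := \{\{x,y\},\{\overline{x},y\},\{x,\overline{y}\},\{\overline{x},\overline{y}\}\}$: $\mathrm{r}_k(F) = F$ for $k \leq 1$, $\mathrm{r}_k(F) =
\{\bot\}$ for $k \geq 2$.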
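
The recursive definition of the reductions can be transcribed almost literally. The following Python sketch is a naive transcription meant for experimenting with small examples; it makes no attempt to reach the time bound quoted above, and the integer encoding of literals as well as the names `cls`, `assign`, `lits` and `r` are assumptions made for this illustration.

```python
BOT = frozenset([frozenset()])   # {⊥}
TOP = frozenset()                # ⊤

def cls(*clauses):
    """Build a clause-set from iterables of nonzero integers (-v negates v)."""
    return frozenset(frozenset(c) for c in clauses)

def assign(F, x):
    """<x -> 1> * F: drop clauses containing x, delete -x from the others."""
    return frozenset(C - {-x} for C in F if x not in C)

def lits(F):
    """lit(F): both polarities of every variable occurring in F."""
    return sorted({s * abs(l) for C in F for l in C for s in (1, -1)})

def r(k, F):
    """Generalised unit-clause propagation of level k."""
    if frozenset() in F:
        # Shortcut: once the empty clause is present the result is {⊥}.
        return BOT
    if k == 0:
        return F
    for x in lits(F):
        # If setting x to 0 is refuted at level k-1, then x -> 1 is forced.
        if r(k - 1, assign(F, -x)) == BOT:
            return r(k, assign(F, x))
    return F
```

Here `r(1, F)` behaves as unit-clause propagation and `r(2, F)` as failed literal elimination, which can be checked on the small clause-sets of the preceding example.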

Via the reductions $\mathrm{r}_k$ we can approximate the implication relation $F \models C$ as
follows:

Definition 4.5 ([36, 37]) For $k \in \mathbb{N}_0$, clause-sets $F$ and clauses $C$ the relation
$F \models_k C$ holds if $\mathrm{r}_k(\varphi_C * F) = \{\bot\}$.

As it is well-known, $F \models_1 C$ iff some subclause of $C$ follows from $F$ via input
resolution.

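Example 4.6 Consider $k \in \mathbb{N}_0$ and literals $x, y, w$:

1. For all $k \geq 0$ and all clauses $C$ we have:

(a) $F \models_k C$ if there is $D \in F$ with $D \subseteq C$ (note $\bot \in \varphi_C * F$).

(b) $\{\bot\} \models_k C$ and $\top \not\models_k C$.

2. $\{\{x,y\},\{x,\overline{y}\}\} \models_k \{x\}$ iff $k \geq 1$.

3. For $F := \{\{\overline{x},y\},\{\overline{y},z\}\}$ we have $F \models_k \{\overline{x},z\}$ iff $k \geq 1$.

4. For $F := \{\{\overline{x},y,w\},\{\overline{y},z,w\},\{\overline{x},y,\overline{w}\},\{\overline{y},z,\overline{w}\}\}$ we have $F \models_k \{\overline{x},z\}$ iff
$k \geq 2$ (note that $\langle x \to 1, z \to 0 \rangle * F \in 2\text{-}\mathcal{CLS}$).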
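
The relation of Definition 4.5 can be tested mechanically on small inputs: falsify every literal of C and check whether the level-k reduction produces the empty clause. The Python sketch below does this; it assumes that the assignment written with index C in the definition sets exactly the literals of C to false, and the encoding and the names `cls`, `assign`, `r`, `implies_k` are hypothetical.

```python
BOT = frozenset([frozenset()])   # {⊥}

def cls(*clauses):
    """Build a clause-set from iterables of nonzero integers (-v negates v)."""
    return frozenset(frozenset(c) for c in clauses)

def assign(F, x):
    """<x -> 1> * F."""
    return frozenset(C - {-x} for C in F if x not in C)

def r(k, F):
    """Naive transcription of generalised unit-clause propagation of level k."""
    if frozenset() in F:
        return BOT
    if k == 0:
        return F
    literals = sorted({s * abs(l) for C in F for l in C for s in (1, -1)})
    for x in literals:
        if r(k - 1, assign(F, -x)) == BOT:
            return r(k, assign(F, x))
    return F

def implies_k(k, F, C):
    """Check the level-k implication of clause C by clause-set F."""
    G = F
    for x in C:
        G = assign(G, -x)        # set every literal of C to false
    return r(k, G) == BOT
```

For example, with variables 1, 2, 3 standing for x, y, z, the clause-set `cls([-1, 2], [-2, 3])` implies the clause `[-1, 3]` at level 1 but not at level 0.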



4.3 Generalised input resolution 

In [?], Chapter 4, the levelled height "$\mathrm{h}(T)$" of branching trees $T$ has been
introduced, which was further generalised in [?], Chapter 3 (to a general form
of constraint satisfaction problems). It handles satisfiable as well as unsatisfiable
clause-sets. In this report we will only use the unsatisfiable case. In this case
the measure reduces to a well-known measure which only considers the structure
of the tree. As discussed in Subsections 4.2, 4.3 of [?], this case, the levelled
height of splitting trees for unsatisfiable clause-sets, appeared at many places
in the literature. Ansótegui et al. used the term "Horton-Strahler number"
(sometimes also "Strahler number"): it seems the oldest source (from 1945),
however disconnected from its various (re-)inventions in computer science. As
in Ansótegui et al. the Horton-Strahler number of the trivial tree is 0.

Definition 4.7 Consider a resolution tree $T$. The Horton-Strahler number
$\mathrm{hs}(T) \in \mathbb{N}_0$ is defined as $\mathrm{hs}(T) := 0$, if $T$ is trivial (consists only of
one node), while otherwise we have two subtrees $T_1, T_2$, and we set $\mathrm{hs}(T) :=
\max(\mathrm{hs}(T_1), \mathrm{hs}(T_2))$ if $\mathrm{hs}(T_1) \neq \mathrm{hs}(T_2)$, while in case of $\mathrm{hs}(T_1) = \mathrm{hs}(T_2)$ we set
$\mathrm{hs}(T) := \max(\mathrm{hs}(T_1), \mathrm{hs}(T_2)) + 1$.
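
The recursion of Definition 4.7 depends only on the shape of the tree, so it can be tried out on bare binary trees. In the Python sketch below a tree is represented either by `None` (the trivial tree) or by a pair of subtrees; this representation and the helper names `hs`, `full_tree` and `comb` are assumptions chosen for the illustration.

```python
def hs(tree):
    """Horton-Strahler number of a binary tree.

    A tree is None (a single node) or a pair (left, right) of subtrees.
    """
    if tree is None:
        return 0
    left, right = hs(tree[0]), hs(tree[1])
    # Equal values of the subtrees increase the number by one,
    # otherwise the larger value is inherited.
    return left + 1 if left == right else max(left, right)

def full_tree(height):
    """Full binary tree of the given height."""
    if height == 0:
        return None
    return (full_tree(height - 1), full_tree(height - 1))

def comb(length):
    """Tree in which every inner node has a leaf as its left child."""
    tree = None
    for _ in range(length):
        tree = (None, tree)
    return tree
```

Full binary trees maximise the number for a given height, while comb-shaped trees, which correspond to input resolution, stay at 1 regardless of their size.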

See Sections 4.2, 4.3 in [?] for various characterisations of $\mathrm{hs}(T)$.

Example 4.8 Examples of trees with their Horton-Strahler numbers. We denote
by $T_1$ and $T_2$ in each example the left and right sub-trees of the root.
(Tree drawings not reproduced.)

  Tree T         hs(T)   Explanation
  single node    0       trivial tree
  tree 2         1       hs(T_1) = 0, hs(T_2) = 0
  tree 3         1       hs(T_1) = 0, hs(T_2) = 1
  tree 4         1       hs(T_1) = 0, hs(T_2) = 1
  tree 5         2       hs(T_1) = 1, hs(T_2) = 1
  tree 6         2       hs(T_1) = 1, hs(T_2) = 2



In [36], Section 7 (generalised in [37], Section 5), generalised input resolution
was introduced. We use the notation "$\vdash_k$" for it:

Definition 4.9 ([36, 37]) For a clause-set $F \in \mathcal{CLS}$ and a clause $C \in \mathcal{CL}$ the
relation $F \vdash_k C$ ($C$ can be derived from $F$ by $k$-times nested input
resolution) holds if there exists a resolution tree $T$ and $C' \subseteq C$ with $T : F \vdash C'$ and
$\mathrm{hs}(T) \leq k$.

By parts 1 and 2 of Theorem 7.5 in [36], generalised in Corollary 5.12 in [37]:

Lemma 4.10 ([36, 37]) For clause-sets $F$, clauses $C$ and $k \in \mathbb{N}_0$ we have
$F \models_k C$ if and only if $F \vdash_k C$.



5 Hardness 

This section is devoted to the discussion of $\mathrm{hd} : \mathcal{CLS} \to \mathbb{N}_0$. It is the central
concept of the paper, from which the hierarchy $\mathcal{UC}_k$ is derived (Definition 5.6).
The basic idea is to start with some measurement $h : \mathcal{USAT} \to \mathbb{N}_0$ of "the
complexity" of unsatisfiable $F$. This measure is extended to arbitrary $F \in \mathcal{CLS}$
by maximising over all "sub-instances" of $F$, that is, over all unsatisfiable $\varphi * F$
for (arbitrary) partial assignments $\varphi$. A first guess for $h : \mathcal{USAT} \to \mathbb{N}_0$ is to take
something like the logarithm of the tree-resolution complexity of $F$. However
this measure is too fine-grained, and doesn't yield a hierarchy like $\mathcal{UC}_k$, where
each level brings a qualitative enhancement. Another approach is algorithmical,
measuring how far $F$ is from being refutable by unit-clause propagation. As
shown in [?], actually these two lines of thought can be brought together
by the hardness measure $\mathrm{hd} : \mathcal{USAT} \to \mathbb{N}_0$. Why only tree-resolution, and not
dag-resolution (i.e., full resolution)? The tree-resolution approach is the natural
starting point, and what is easy for tree-resolution is also easy for dag-resolution.
Our basic approach towards the more complicated handling of dag-resolution is
shown in Subsection 9.5.

The outline of this section is as follows. $\mathrm{hd}(F)$ is defined and discussed for
unsatisfiable $F$ in Subsection 5.1. The general case (arbitrary $F$) is handled in
Subsection 5.2 by reduction to the unsatisfiable cases within $F$ (as produced by
applying partial assignments). The central result of this section can be seen in
Theorem 5.7, which shows that $F \in \mathcal{UC}_k$ (i.e., $\mathrm{hd}(F) \leq k$) is equivalent to the
condition that all prime implicates of $F$ can be derived by some resolution tree
with a Horton-Strahler number at most $k$. In this way some form of geometric
intuition is gained, and a machinery becomes available. The first applications
are given by the various lemmas in Section 6 for determining hardness under
various circumstances.

We remark that, when considering only unsatisfiable clause-sets $F$, in [?]
actually a general concept of "hardness" was introduced, parameterised by an
oracle $\mathcal{U} \subseteq \mathcal{USAT}$ for ("easy") detection of special cases of unsatisfiability. In
this report only $\mathcal{U} = \{F \in \mathcal{CLS} : \bot \in F\}$ is used, but we expect the general
theory to become important in the future. See Subsection 9.4 for some further
discussions.



5.1 Hardness of unsatisfiable clause-sets 

In [36] the following hardness parameter was introduced and investigated
(further generalised in [37]):

Definition 5.1 ([36, 37]) The hardness $\mathrm{hd}(F)$ of an unsatisfiable $F \in \mathcal{CLS}$
is the minimal $k \in \mathbb{N}_0$ such that $\mathrm{r}_k(F) = \{\bot\}$.

As shown in [?], $\mathrm{hd}(F) + 1$ is precisely the clause-space complexity of $F$
regarding tree-resolution (see Nordström for a recent overview on space complexity
of resolution). In [?] the notation "$\mathrm{h}(F)$" was used (resp., more generally,
"$\mathrm{h}_{\mathcal{U},\mathcal{S}}(F)$", using oracles for unsatisfiability and satisfiability detection), which
seems now to us too unspecific. From Henschen and Wos we gain the insight
that for $F \in \mathcal{USAT}$ holds $\mathrm{hd}(F) \leq 1$ iff there exists $F' \subseteq F$ which is an
unsatisfiable renamable Horn clause-set (i.e., $F' \in \mathcal{RHO} \cap \mathcal{USAT}$). By Theorem 7.8
(and Corollary 7.9) in [36] (or, more generally, Theorem 5.14 in [37]) we have
for $F \in \mathcal{USAT}$:

$$2^{\mathrm{hd}(F)} \leq \mathrm{Comp}^*_{\mathrm{R}}(F) \leq (n(F) + 1)^{\mathrm{hd}(F)}.$$

Example 5.2 Some basic determinations of $\mathrm{hd}(F)$ for unsatisfiable $F$:

1. $\mathrm{hd}(F) = 0$ iff $\bot \in F$.

2. $\mathrm{hd}(\{\{x\},\{\overline{x}\}\}) = 1$.

3. $\mathrm{hd}(\{\{x\},\{\overline{x},y\},\{\overline{y},z\},\{\overline{z}\}\}) = 1$.

4. $\mathrm{hd}(\{\{x,y\},\{\overline{x},y\},\{x,\overline{y}\},\{\overline{x},\overline{y}\}\}) = 2$.

5. \\d{{{x,y},{x,y},{y,z},{y,z},{x,y,z},{x,y,z}}) = 2. 
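
For unsatisfiable clause-sets, Definition 5.1 suggests a direct (if naive) computation: apply the reductions at levels 0, 1, 2, ... until the empty clause is derived. The Python sketch below follows this idea on small inputs; the encoding, the names `cls`, `assign`, `r`, `hd_unsat`, and the decision to raise `ValueError` on satisfiable inputs are assumptions of this example. The search is stopped at the number of variables, using the bound that hardness never exceeds the number of variables.

```python
BOT = frozenset([frozenset()])   # {⊥}

def cls(*clauses):
    """Build a clause-set from iterables of nonzero integers (-v negates v)."""
    return frozenset(frozenset(c) for c in clauses)

def assign(F, x):
    """<x -> 1> * F."""
    return frozenset(C - {-x} for C in F if x not in C)

def r(k, F):
    """Naive transcription of generalised unit-clause propagation of level k."""
    if frozenset() in F:
        return BOT
    if k == 0:
        return F
    literals = sorted({s * abs(l) for C in F for l in C for s in (1, -1)})
    for x in literals:
        if r(k - 1, assign(F, -x)) == BOT:
            return r(k, assign(F, x))
    return F

def hd_unsat(F):
    """Hardness of an unsatisfiable clause-set: least k with r_k(F) = {⊥}."""
    n = len({abs(l) for C in F for l in C})
    for k in range(n + 1):
        if r(k, F) == BOT:
            return k
    # No level up to n(F) refutes F, so F must be satisfiable.
    raise ValueError("clause-set is satisfiable")
```

Because each level is tried in turn, the cost grows quickly with the hardness; the sketch is intended for verifying small examples only.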



By Lemma 4.10 we get:

Lemma 5.3 ([36, 37]) For an unsatisfiable clause-set $F$ and $k \in \mathbb{N}_0$ we have
$\mathrm{hd}(F) \leq k$ iff $F \models_k \bot$ iff $F \vdash_k \bot$.

By applying partial assignments we can reach all hardness-levels in a clause-set,
as the following lemma shows.

Lemma 5.4 For an unsatisfiable clause-set $F$ and every $0 \leq k \leq \mathrm{hd}(F)$ there
exists a partial assignment $\varphi$ with $n(\varphi) = k$ and $\mathrm{hd}(\varphi * F) = \mathrm{hd}(F) - k$.

Proof: We proceed by induction on $n(F)$. As $k \leq \mathrm{hd}(F) \leq n(F)$, for the
base case we consider $n(F) = k$. If $n(F) = k$ then all $\varphi$ with $n(\varphi) = k$ have
$\mathrm{hd}(\varphi * F) = \mathrm{hd}(\{\bot\}) = 0 = \mathrm{hd}(F) - k$. For $n(F) > k$, we make a case distinction
on the value of $k$. If $k = 0$ then choose $\varphi = \langle\rangle$. If $k = 1$ then:

1. Assume for the sake of contradiction that there is no $x \in \mathrm{lit}(F)$ such that
$\mathrm{hd}(\langle x \to 1 \rangle * F) = \mathrm{hd}(F) - 1$; otherwise we are done.

2. If for all $x \in \mathrm{lit}(F)$ we had $\mathrm{hd}(\langle x \to 1 \rangle * F) \leq \mathrm{hd}(F) - 2$ then by Definition
5.1 we would have $\mathrm{hd}(F) \leq \mathrm{hd}(F) - 1$, a contradiction.

3. Therefore there must exist an $x \in \mathrm{lit}(F)$ such that

$$\mathrm{hd}(F) = \mathrm{hd}(\langle x \to 1 \rangle * F) > \mathrm{hd}(\langle x \to 0 \rangle * F) + 1.$$

4. By induction hypothesis we have a partial assignment $\varphi$ with $n(\varphi) = 1$
such that $\mathrm{hd}(\varphi * (\langle x \to 1 \rangle * F)) = \mathrm{hd}(F) - 1$.

5. Application of partial assignments doesn't increase hardness (Lemma 3.11
of [?]) and so we have

$$\mathrm{hd}(\varphi * F) \geq \mathrm{hd}(\langle x \to 1 \rangle * (\varphi * F)) = \mathrm{hd}(F) - 1.$$

6. By our choice of $x$ we have

$$\mathrm{hd}(\langle x \to 1 \rangle * (\varphi * F)) = \mathrm{hd}(F) - 1$$
$$\mathrm{hd}(\langle x \to 0 \rangle * (\varphi * F)) \leq \mathrm{hd}(F) - 2,$$

therefore by Definition 5.1 we have $\mathrm{hd}(\varphi * F) \leq \mathrm{hd}(F) - 1$.

7. Thus we have that $\mathrm{hd}(\varphi * F) = \mathrm{hd}(F) - 1$.

Finally, for $k > 1$, we apply induction using the $k = 1$ case; once we can reduce
by 1 we can reduce by $k$. □

5.2 Hardness of arbitrary clause-sets

The hardness $\mathrm{hd}(F)$ of arbitrary clause-sets can now be defined as the maximum
hardness over all unsatisfiable instances obtained by partial assignments.

Definition 5.5 The hardness $\mathrm{hd}(F) \in \mathbb{N}_0$ for $F \in \mathcal{CLS}$ is the minimal $k \in
\mathbb{N}_0$ such that for all clauses $C$ with $F \models C$ we have $F \models_k C$ (recall Definition
4.5; by Lemma 4.10 this is equivalent to $F \vdash_k C$).

In other words, if $F \neq \top$ then $\mathrm{hd}(F)$ is the maximum of $\mathrm{hd}(\varphi * F)$ for partial
assignments $\varphi$ such that $\varphi * F \in \mathcal{USAT}$. To our knowledge, the measure $\mathrm{hd}(F)$
for satisfiable $F$ was mentioned the first time in the literature in Ansótegui
et al., Definition 8 (the only result there concerning this measure is Lemma
9, relating it to another hardness-alternative for satisfiable $F$). Note that one
can restrict attention in Definition 5.5 to $C \in \mathrm{prc}_0(F)$. Hardness 0 means
that all prime clauses are there, i.e., $\mathrm{hd}(F) = 0$ iff $\mathrm{prc}_0(F) \subseteq F$. Especially
$\mathrm{hd}(\top) = 0$.

Lemma 5.4, stating that $\mathrm{hd}(\varphi * F)$ takes exactly the values from 0 to $\mathrm{hd}(F)$,
extends by definition to satisfiable $F \in \mathcal{CLS}$, when adding to the size of the
partial assignment $\varphi$ the minimum size of a partial assignment $\psi$ with $\psi * F \in
\mathcal{USAT}$ and $\mathrm{hd}(\psi * F) = \mathrm{hd}(F)$.

Definition 5.6 For $k \in \mathbb{N}_0$ let $\mathcal{UC}_k := \{F \in \mathcal{CLS} : \mathrm{hd}(F) \leq k\}$ (the class of
unit-refutation complete clause-sets of level $k$).

The class $\mathcal{UC}_1$ has been introduced in del Val for knowledge compilation.
Various (resolution-based) algorithms computing for clause-sets $F$ some
equivalent set $F' \in \mathcal{UC}_1$ of prime implicates are discussed there. Based on the results
from [36, 37], we can now give a powerful proof-theoretic characterisation for
all classes $\mathcal{UC}_k$:

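Theorem 5.7 For $k \in \mathbb{N}_0$ and $F \in \mathcal{CLS}$ we have

$$F \in \mathcal{UC}_k \iff \forall\, C \in \mathrm{prc}_0(F) : F \vdash_k C.$$

Thus if every $C \in \mathrm{prc}_0(F)$ has a tree-resolution refutation using at most $2^{k+1} - 1$
leaves (i.e., $\mathrm{Comp}^*_{\mathrm{R}}(\varphi_C * F) < 2^{k+1}$), then $\mathrm{hd}(F) \leq k$.

Proof: The equivalence $F \in \mathcal{UC}_k \iff \forall\, C \in \mathrm{prc}_0(F) : F \vdash_k C$ follows from
Lemma 4.10. And if $\mathrm{hd}(F) > k$, then there is $C \in \mathrm{prc}_0(F)$ with $F \not\vdash_k C$, and
then every tree-resolution derivation of $C$ from $F$ needs at least $2^{k+1}$ leaves due
to $2^{\mathrm{hd}(\varphi_C * F)} \leq \mathrm{Comp}^*_{\mathrm{R}}(\varphi_C * F)$ (as stated before). □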

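Example 5.8 Here are some basic calculations of hardness for satisfiable
clause-sets (for unsatisfiable $F$ see Example 5.2), using Theorem 5.7:

1. $\mathrm{hd}(\top) = 0$.

2. $\mathrm{hd}(\{\{x\}\}) = 0$.

3. For $F := \{\{x,y\},\{x,\overline{y}\}\}$ we have $\mathrm{hd}(F) = 1$:

(a) $\mathrm{prc}_0(F) = \{\{x\}\}$.

(b) $\mathrm{hd}(\langle x \to 0 \rangle * F) = \mathrm{hd}(\{\{y\},\{\overline{y}\}\}) = 1$.

4. For $F := \{\{\overline{x},y\},\{\overline{y},z\}\}$ we have $\mathrm{hd}(F) = 1$:

(a) $\mathrm{prc}_0(F) = \{\{\overline{x},y\},\{\overline{y},z\},\{\overline{x},z\}\}$.

(b) $\mathrm{hd}(\langle x \to 1, y \to 0 \rangle * F) = \mathrm{hd}(\{\bot\}) = 0$.

(c) $\mathrm{hd}(\langle y \to 1, z \to 0 \rangle * F) = \mathrm{hd}(\{\bot\}) = 0$.

(d) $\mathrm{hd}(\langle x \to 1, z \to 0 \rangle * F) = \mathrm{hd}(\{\{y\},\{\overline{y}\}\}) = 1$.

5. For $F := \{\{z,x,y\},\{z,\overline{x},y\},\{z,x,\overline{y}\},\{z,\overline{x},\overline{y}\}\}$ we have $\mathrm{hd}(F) = 2$:

(a) $\mathrm{prc}_0(F) = \{\{z\}\}$.

(b) $\mathrm{hd}(\langle z \to 0 \rangle * F) = \mathrm{hd}(\{\{x,y\},\{\overline{x},y\},\{x,\overline{y}\},\{\overline{x},\overline{y}\}\}) = 2$.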
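
The characterisation of hardness as a maximum over unsatisfiable sub-instances can be checked by brute force on small clause-sets: enumerate all partial assignments, keep those whose result is unsatisfiable, and take the largest hardness found. The Python sketch below does exactly this; it is exponential (three choices per variable) and purely illustrative, and the encoding and the names `cls`, `assign`, `r`, `hd` are assumptions. Unsatisfiability of a sub-instance is detected by trying all levels up to its number of variables.

```python
from itertools import product

BOT = frozenset([frozenset()])   # {⊥}

def cls(*clauses):
    """Build a clause-set from iterables of nonzero integers (-v negates v)."""
    return frozenset(frozenset(c) for c in clauses)

def assign(F, x):
    """<x -> 1> * F."""
    return frozenset(C - {-x} for C in F if x not in C)

def r(k, F):
    """Naive transcription of generalised unit-clause propagation of level k."""
    if frozenset() in F:
        return BOT
    if k == 0:
        return F
    literals = sorted({s * abs(l) for C in F for l in C for s in (1, -1)})
    for x in literals:
        if r(k - 1, assign(F, -x)) == BOT:
            return r(k, assign(F, x))
    return F

def hd(F):
    """Hardness of an arbitrary clause-set, by exhaustive search."""
    variables = sorted({abs(l) for C in F for l in C})
    best = 0
    # Each variable is left unset (0), set to true (1) or set to false (-1).
    for signs in product((0, 1, -1), repeat=len(variables)):
        G = F
        for v, s in zip(variables, signs):
            if s:
                G = assign(G, s * v)
        n = len({abs(l) for C in G for l in C})
        # Least level refuting G, or None if G is satisfiable.
        level = next((k for k in range(n + 1) if r(k, G) == BOT), None)
        if level is not None:
            best = max(best, level)
    return best
```

On unsatisfiable inputs the empty partial assignment already attains the maximum, so the function agrees there with the definition for the unsatisfiable case.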



6 Fundamental properties of UCk 

In Subsection 6.1 we determine hardness for various constructions. In
Subsection 6.2 we consider various classes contained in some $\mathcal{UC}_k$ together with
stability properties of $\mathcal{UC}_k$. Relations to alternative hierarchies from the
literature are discussed in Subsection 6.3. We conclude our discussion of basic
properties of hardness in Subsection 6.4, considering the most basic cases of
precise hardness-computations. We stress that (algorithmic) computation of
hardness for arbitrary instances is less important here^2, since we aim more at
constructing "soft" (low hardness) representations than measuring hardness of
given instances. What is needed is a theory to identify general constructions.

6.1 Some basic hardness determinations 

The following basic lemma follows directly by definition: 



Lemma 6.1 If two clause-sets $F$ and $F'$ are variable-disjoint, then we have:

1. If $F, F' \in \mathcal{SAT}$, then $\mathrm{hd}(F \cup F') = \max(\mathrm{hd}(F), \mathrm{hd}(F'))$.

2. If $F \in \mathcal{SAT}$ and $F' \in \mathcal{USAT}$, then $\mathrm{hd}(F \cup F') = \mathrm{hd}(F')$.

3. If $F, F' \in \mathcal{USAT}$, then $\mathrm{hd}(F \cup F') = \min(\mathrm{hd}(F), \mathrm{hd}(F'))$.

Via full clause-sets $A_n$ with $n$ variables and $2^n$ clauses we obtain
(unsatisfiable, simplest) examples with $\mathrm{hd}(A_n) = n$, and when removing one clause for
$n \geq 1$, then we obtain satisfiable examples $A'_n$ with $\mathrm{hd}(A'_n) = n - 1$:



^2 Decision of membership in $\mathcal{UC}_k$ for $k \geq 1$ is coNP-complete, as shown in Theorem 7.5,
which seems natural for classes with strong expressive power.

Lemma 6.2 Consider a full clause-set $F \in \mathcal{CLS}$ (i.e., each clause contains all
variables).

1. $\mathrm{hd}(\top) = 0$.

2. If $F$ is unsatisfiable then $\mathrm{hd}(F) = n(F)$.

3. If $F \neq \top$ then $\mathrm{hd}(F) = n(F) - \min_{C \in \mathrm{prc}_0(F)} |C|$.

4. If for $F$ no two clauses are resolvable, then $\mathrm{hd}(F) = 0$.

Proof: Part 1 follows by Definition, Part 2 is Lemma 3.18 in [?], while Part
4 follows from Part 3. It remains to show Part 3. If $F$ is unsatisfiable, then we
get Part 2. For satisfiable $F$ and a partial assignment $\varphi$ with $\mathrm{var}(\varphi) \subseteq \mathrm{var}(F)$
it is $\varphi * F$ a full clause-set with $n(\varphi * F) = n(F) - n(\varphi)$, and so the assertion
follows by reduction to the unsatisfiable case. □

The following lemma yields a way of pumping up hardness:



Lemma 6.3 Consider $F \in \mathcal{CLS}$ and $v \in \mathcal{VA} \setminus \mathrm{var}(F)$. Let $F' := \{C \cup \{v\} :
C \in F\} \cup \{C \cup \{\overline{v}\} : C \in F\}$. Then we have $\mathrm{hd}(F') = \mathrm{hd}(F) + 1$.

Proof: We have $\mathrm{hd}(F') \leq \mathrm{hd}(F) + 1$ by definition (if $v$ is not set by the
test-assignment, then it can be set to an arbitrary value, yielding a forced assignment
at level $\mathrm{hd}(F)$). Now consider a partial assignment $\varphi$ with $\mathrm{var}(\varphi) \subseteq \mathrm{var}(F)$,
$\varphi * F \in \mathcal{USAT}$ and $\mathrm{hd}(\varphi * F) = \mathrm{hd}(F)$. Now also $\varphi * F' \in \mathcal{USAT}$ holds, where
$\varphi * F' = \{C \cup \{v\} : C \in \varphi * F\} \cup \{C \cup \{\overline{v}\} : C \in \varphi * F\}$. Thus we have reduced
the assertion of the lemma to the special case where $F \in \mathcal{USAT}$, and where
$\mathrm{hd}(F') \geq \mathrm{hd}(F) + 1$ is left to be shown. This now follows easily by induction
on the number of variables. □
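
The pumping construction is easy to try out on small instances. The Python sketch below builds F' from F and a fresh variable and compares hardness values computed by exhaustive search; the encoding and the names `cls`, `assign`, `r`, `hd`, `pump` are assumptions for illustration, and only non-empty clause-sets are used as inputs. Pumping the clause-set {⊥} repeatedly produces the full unsatisfiable clause-sets over 1, 2, ... variables.

```python
from itertools import product

BOT = frozenset([frozenset()])   # {⊥}

def cls(*clauses):
    """Build a clause-set from iterables of nonzero integers (-v negates v)."""
    return frozenset(frozenset(c) for c in clauses)

def assign(F, x):
    """<x -> 1> * F."""
    return frozenset(C - {-x} for C in F if x not in C)

def r(k, F):
    """Naive transcription of generalised unit-clause propagation of level k."""
    if frozenset() in F:
        return BOT
    if k == 0:
        return F
    literals = sorted({s * abs(l) for C in F for l in C for s in (1, -1)})
    for x in literals:
        if r(k - 1, assign(F, -x)) == BOT:
            return r(k, assign(F, x))
    return F

def hd(F):
    """Hardness by exhaustive search over all partial assignments."""
    variables = sorted({abs(l) for C in F for l in C})
    best = 0
    for signs in product((0, 1, -1), repeat=len(variables)):
        G = F
        for v, s in zip(variables, signs):
            if s:
                G = assign(G, s * v)
        n = len({abs(l) for C in G for l in C})
        level = next((k for k in range(n + 1) if r(k, G) == BOT), None)
        if level is not None:
            best = max(best, level)
    return best

def pump(F, v):
    """Add the new variable v to every clause of F, once in each polarity."""
    return frozenset(C | {s * v} for C in F for s in (1, -1))
```

For instance, `pump(pump(BOT, 1), 2)` is the clause-set of all four binary clauses over the variables 1 and 2.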



6.2 Containment and stability properties 

The following fundamental lemma is obvious from the definition:

Lemma 6.4 Consider $\mathcal{C} \subseteq \mathcal{CLS}$ stable under application of partial assignments
and $k \in \mathbb{N}_0$. If $\mathcal{C} \cap \mathcal{USAT} \subseteq \mathcal{UC}_k$ then $\mathcal{C} \subseteq \mathcal{UC}_k$.

We apply Lemma 6.4 to various well-known classes $\mathcal{C}$ (stating in brackets
the source for the bound on the unsatisfiable cases).

Lemma 6.5 Consider $F \in \mathcal{CLS}$.

1. For $\varphi \in \mathcal{PASS}$ we have $\mathrm{hd}(\varphi * F) \leq \mathrm{hd}(F)$ (by Lemma 3.11 in [?]).

2. $\mathrm{hd}(F) \leq n(F)$ (by Lemma 3.18 in [?]).

3. If $F \in 2\text{-}\mathcal{CLS} = \{F \in \mathcal{CLS} \mid \forall\, C \in F : |C| \leq 2\}$, then $\mathrm{hd}(F) \leq 2$ (by
Lemma 5.6 in [?]).

4. If $F \in \mathcal{HO} = \{F \in \mathcal{CLS} \mid \forall\, C \in F : |C \cap \mathcal{VA}| \leq 1\}$ (Horn clause-sets),
then $\mathrm{hd}(F) \leq 1$ (by Lemma 5.8 in [?]).

5. More generally, if $F \in \mathcal{QHO}$, the set of q-Horn clause-sets (see Section
6.10.2 in Crama and Hammer, and van Maaren), then $\mathrm{hd}(F) \leq 2$
(by Lemma 5.12 in [?]).

6. Generalising Horn clause-sets to the hierarchy $\mathcal{HO}_k$ from Kleine Büning
(with $\mathcal{HO}_1 = \mathcal{HO}$): if $F \in \mathcal{HO}_k$ for $k \in \mathbb{N}$, then $\mathrm{hd}(F) \leq k$ (by
Lemma 5.10 in [?]).

Obviously Part 4 of Lemma 6.5 can be generalised to $F \in \mathcal{RHO}$ (see Lemma 6.7,
Part 3). And considering Part 3, by a standard autarky-argument for $2\text{-}\mathcal{CLS}$
(see [?]) we can sharpen the hardness-upper-bound 2 for satisfiable clause-sets:



Lemma 6.6 For $F \in 2\text{-}\mathcal{CLS} \cap \mathcal{SAT}$ we have $\mathrm{hd}(F) \leq 1$.

Proof: Consider a partial assignment $\varphi$ with unsatisfiable $\varphi * F$. Now we have
$\mathrm{r}_1(\varphi * F) = \{\bot\}$, since otherwise $\mathrm{r}_1(\varphi * F) \subseteq F$, and thus $\mathrm{r}_1(\varphi * F)$ would be
satisfiable. □

We have the following stability properties:

Lemma 6.7 Consider $k \in \mathbb{N}_0$.

1. $\mathcal{UC}_k$ is stable under application of partial assignments (with Lemma 6.5,
Part 1; this might reduce hardness).

2. $\mathcal{UC}_k$ is stable under variable-disjoint union (with Lemma 6.1).

3. $\mathcal{UC}_k$ is stable under renaming variables and switching polarities (by
definition).

4. $\mathcal{UC}_k$ is stable under subsumption-elimination (by basic properties of
resolution).

5. $\mathcal{UC}_k$ is stable under addition of inferred clauses (by definition; this might
reduce hardness).

Example 6.8 Examples for non-stability:

1. $\mathcal{UC}_0$ is obviously not stable under removal of clauses.

2. $\mathcal{UC}_0$ is not stable under removal of literal occurrences, for example
$\{\{x,y\},\{\overline{x},\overline{y}\}\} \in \mathcal{UC}_0$, but $\{\{x\},\{\overline{x},\overline{y}\}\} \notin \mathcal{UC}_0$.

3. $\mathcal{UC}_0$ is not stable under crossing out of variables, e.g. $\{\{x,y\},\{\overline{x},\overline{y}\}\} \in
\mathcal{UC}_0$, but when crossing out variable $x$ we obtain $\{\{y\},\{\overline{y}\}\} \notin \mathcal{UC}_0$.

4. $\mathcal{UC}_0$ is not stable under addition of clauses, for example $\{\{x\}\} \in \mathcal{UC}_0$, but
$\{\{x\},\{\overline{x}\}\} \notin \mathcal{UC}_0$.

5. $\mathcal{UC}_0$ is not stable under addition of literal occurrences, e.g. $\{\{x\},\{y\}\} \in
\mathcal{UC}_0$, but $\{\{x,\overline{y}\},\{y\}\} \notin \mathcal{UC}_0$.

6.3 Alternative hierarchies



No class $\mathcal{UC}_k$ is stable under removal of clauses. We will see in this
subsection that this boils down to the class $\mathcal{U}_0$ of clause-sets containing the empty
clause not being stable under removal of clauses. Some classes contained in
$\mathcal{UC}_1$ however are stable under removal of clauses, for example renamable Horn
clause-sets ($\mathcal{RHO}$), and in Čepek and Kučera hierarchies based on this
more restricted class have been considered. To understand the connection to
our approach, some comments on the use of "oracles" in this setting are needed
(see Subsection 9.4 for future developments).

In [36, 37] the hierarchy $G_k(\mathcal{U},\mathcal{S}) \subseteq \mathcal{CLS}$ ($k \in \mathbb{N}_0$) has been introduced,
using oracles $\mathcal{U} \subseteq \mathcal{USAT}$ for unsatisfiability detection and $\mathcal{S} \subseteq \mathcal{SAT}$ for
satisfiability detection:

1. The minimal oracles considered there are $\mathcal{U}_0 := \{F \in \mathcal{CLS} : \bot \in F\}$ and
$\mathcal{S}_0 := \{\top\}$.

2. One uses $G^0_k(\mathcal{U},\mathcal{S}) := G_k(\mathcal{U},\mathcal{S}) \cap \mathcal{USAT}$ and $G^1_k(\mathcal{U},\mathcal{S}) := G_k(\mathcal{U},\mathcal{S}) \cap \mathcal{SAT}$.
Since $G^0_k(\mathcal{U},\mathcal{S})$ does not depend on $\mathcal{S}$, one writes $G^0_k(\mathcal{U}) := G^0_k(\mathcal{U},\mathcal{S})$.

3. For all $k \in \mathbb{N}_0$ holds $G^0_k(\mathcal{U}_0) = \mathcal{UC}_k \cap \mathcal{USAT}$. On satisfiable instances in
general the hierarchies are incomparable.

4. If $\mathcal{C} \subseteq \mathcal{CLS}$ is stable under application of partial assignments, then each
class $G_k(\mathcal{C}) := G_k(\mathcal{C} \cap \mathcal{USAT}, \mathcal{C} \cap \mathcal{SAT})$ (for $k \in \mathbb{N}_0$) is also stable under
partial assignments (Lemma 4.2 in [37]). So if $\mathcal{C} \cap \mathcal{USAT} \subseteq \mathcal{UC}_{k'}$ for some
$k' \in \mathbb{N}_0$, then we have $G_k(\mathcal{C}) \subseteq \mathcal{UC}_{k+k'}$ (using Lemma 6.4). This is the
basis of all inclusion-relations of Section [?].

5. In [36, 37] it is assumed that $\mathcal{U}_0 \subseteq \mathcal{U}$ holds. This ensures that $\mathcal{UC}_k \cap
\mathcal{USAT} \subseteq G^0_k(\mathcal{U})$ always holds, but in most cases makes classes $G_k(\mathcal{U},\mathcal{S})$
unstable under elimination of clauses.

In Cepek and Kucera two hierarchies (nfc)fcgNoi (Tfc)fcgNo have been 
introduced; the basic motivations and the relations to our hierarchies are as 
follows: 

1. We have Hfc n US AT = GHWHO) and lik n SAT C GUTLHO) (with 
Ho = WHO). Note that wc do not have % C TZUO here. 



2. It is -RHO n US AT C G\{Ua) (Lemma ^ Part |), while WHO n SAT 



is not included in any G\{U,So). More generally we have Hfc CiUSAT C 
Gl^j^iUo) for aU fc > 0. 

3. So the choice of the oracle TVHO is less powerful on unsatisfiable instances 
than the choice of (when going up one level in the hierarchy), while the 
special recognition of satisfiability for WHO is (naturally) not captured 
by any level of the Gfc-hierarchy, when using only the trivial satisfiability- 
oracle So (even using U = USAT does not change this, since this only 
yields full handling of all forced assignments, while a satisfiable instance 
in TZ'HO might not have any forced assignment). 
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4. For fc > 1 we have Ilk n SAT C GliTZnO), where an example for F G 
GlinnO)\Ilk isgivenby {{v}UC : C £ i^'} for some F' e C/:5\nfc 
and V E V^\var(F'). The point is that recognition for the Gk{U,S)- 
hierarchy aheady includes satisfiability-decision (at lower levels), and if 
one branch, here {v — J> 1), yields a satisfiable instance, then the other 
branch ((u — ;> 0)) is not inspected — which however is the case for 11^. 

5. $\mathcal{RHO}$ is stable under application of partial assignments, and, that is its main feature, stable under removal of clauses. This yields that all $\Pi_k$ are stable under removal of clauses, which is the main motivation for this choice of the base oracle.

6. $\mathcal{U}_0$ is not contained in any $\Pi_k$, and thus there are unsatisfiable clause-sets of hardness 0 not contained in any given $\Pi_k$.

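7. Čepek and Kučera considered also (shortly) the hierarchy $\Upsilon_k \subset \mathcal{CLS}$ ($k \in \mathbb{N}_0$), with $\Upsilon_k \cap \mathcal{USAT} = G^0_k(\mathcal{QHO})$ and $\Upsilon_k \cap \mathcal{SAT} \subset G^1_k(\mathcal{QHO})$, based on the stronger oracle $\mathcal{QHO} \supset \mathcal{RHO}$ of q-Horn clause-sets (again stable under application of partial assignments and removal of clauses). We have $\Upsilon_k \cap \mathcal{USAT} \subseteq G^0_{k+2}(\mathcal{U}_0)$ for all $k \geq 0$ (Lemma ??, Part ??).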



By Lemma 6.4 we get:

Lemma 6.9 For all $k \in \mathbb{N}_0$ we have $\Pi_k \subseteq \mathcal{UC}_{k+1}$ and $\Upsilon_k \subseteq \mathcal{UC}_{k+2}$ for the hierarchies $\Pi_k$, $\Upsilon_k$ introduced in Čepek and Kučera.



6.4 Determining hardness computationally 

By the well-known computation of $\mathrm{prc}_0(F)$ via resolution-closure we obtain:

Lemma 6.10 Whether for $F \in \mathcal{CLS}$ we have $\mathrm{hd}(F) = 0$ or not can be decided in polynomial time, namely $\mathrm{hd}(F) = 0$ holds if and only if $F$ is stable under resolution modulo subsumption (which means that for all resolvable $C, D \in F$ with resolvent $R$ there exists $E \in F$ with $E \subseteq R$).
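The criterion of Lemma 6.10 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the paper: clauses are represented as sets of nonzero integers (DIMACS-style literals, with `-x` the complement of `x`), clause-sets are assumed to contain no tautological clauses, two clauses count as resolvable iff they clash in exactly one literal, and the function name `has_hardness_zero` is hypothetical.

```python
from itertools import combinations

def has_hardness_zero(clauses):
    """Check stability under resolution modulo subsumption (Lemma 6.10)."""
    F = {frozenset(C) for C in clauses}
    for C, D in combinations(F, 2):
        clash = [x for x in C if -x in D]
        if len(clash) != 1:          # not resolvable: no clash or several clashes
            continue
        x = clash[0]
        R = (C - {x}) | (D - {-x})   # the resolvent of C and D
        if not any(E <= R for E in F):
            return False             # R is not subsumed by any clause of F
    return True

print(has_hardness_zero([[1, 2], [-1, 2]]))       # False: resolvent {2} is not subsumed
print(has_hardness_zero([[1, 2], [-1, 2], [2]]))  # True: {2} subsumes the resolvent
```

The loop inspects every pair of clauses once, so the running time is polynomial in the size of the input, in line with the lemma.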

Thus if the hardness is known to be at most 1, we can compute it efficiently: 

Corollary 6.11 Consider a class $\mathcal{C} \subseteq \mathcal{CLS}$ of clause-sets where $\mathcal{C} \subseteq \mathcal{UC}_1$ is known. Then for $F \in \mathcal{C}$ one can compute $\mathrm{hd}(F) \in \{0, 1\}$ in polynomial time.

Examples for $\mathcal{C}$ are given by $\mathcal{HO} \subset \mathcal{UC}_1$ (Lemma 6.5) and in Subsection 3.1. Another example class with known hardness is given by $2\text{-}\mathcal{CLS} \subset \mathcal{UC}_2$ (Lemma 3.5), and also here we can compute the hardness efficiently:



Lemma 6.12 For $F \in 2\text{-}\mathcal{CLS}$ one can compute $\mathrm{hd}(F) \in \{0, 1, 2\}$ in polynomial time.

Proof: One method is to observe that for elements of $2\text{-}\mathcal{CLS}$ the set of prime implicates can be determined in polynomial time, while SAT-decision can be done in linear time. More efficient is the following:

1. Determine first whether $F$ is satisfiable or not.

2. If $F$ is satisfiable, then $\mathrm{hd}(F) \in \{0, 1\}$ by Lemma 5.6, and whether $\mathrm{hd}(F) = 0$ or not can be determined by Lemma 6.10.

3. If $F$ is unsatisfiable, then it suffices to compute $r_0(F)$ and $r_1(F)$.

□
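The three steps of the proof can be sketched as follows. This is a minimal illustration under assumptions that are not from the paper: literals are nonzero integers, `r(k, F)` follows the recursive definition of generalised unit-clause propagation ($r_0$ only detects the empty clause, and $r_{k+1}$ sets $x \to 1$ whenever $r_k$ refutes $\langle x \to 0 \rangle * F$), and satisfiability is tested by naive splitting instead of a linear-time 2-SAT algorithm. All function names are hypothetical.

```python
from itertools import combinations

BOT = frozenset({frozenset()})  # the clause-set {⊥}

def assign(F, lit):
    """<lit -> 1> * F: drop satisfied clauses, remove falsified literals."""
    return frozenset(C - {-lit} for C in F if lit not in C)

def r(k, F):
    """Generalised unit-clause propagation r_k (r_1 is unit-clause propagation)."""
    if frozenset() in F:
        return BOT
    if k == 0:
        return F
    for x in sorted({s * abs(y) for C in F for y in C for s in (1, -1)}):
        if r(k - 1, assign(F, -x)) == BOT:  # <x -> 0> * F is refuted one level down
            return r(k, assign(F, x))       # hence x -> 1 is forced
    return F

def satisfiable(F):
    """Naive splitting; a linear-time 2-SAT algorithm would be used in practice."""
    if frozenset() in F:
        return False
    if not F:
        return True
    x = next(iter(next(iter(F))))
    return satisfiable(assign(F, x)) or satisfiable(assign(F, -x))

def resolution_stable(F):
    """Criterion of Lemma 6.10 for hd(F) = 0."""
    for C, D in combinations(F, 2):
        clash = [x for x in C if -x in D]
        if len(clash) == 1:
            R = (C - {clash[0]}) | (D - {-clash[0]})
            if not any(E <= R for E in F):
                return False
    return True

def hardness_2cls(clauses):
    F = frozenset(frozenset(C) for C in clauses)
    assert all(len(C) <= 2 for C in F), "input must be in 2-CLS"
    if satisfiable(F):                       # steps 1 and 2
        return 0 if resolution_stable(F) else 1
    if r(0, F) == BOT:                       # step 3
        return 0
    return 1 if r(1, F) == BOT else 2

print(hardness_2cls([[1, 2], [-1, 2], [1, -2], [-1, -2]]))  # 2
print(hardness_2cls([[1, 2], [-1, 2]]))                     # 1
```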



See Theorem 7.5 for coNP-completeness of determining an upper bound on 
hardness. 



7 The SLUR hierarchy 

We now define the $\mathcal{SLUR}_k$ hierarchy, generalising $\mathcal{SLUR}$ (recall Subsection 3.1) in a natural way, by replacing $r_1$ with $r_k$. In Subsection 7.1 we show $\mathcal{SLUR}_k = \mathcal{UC}_k$, and as application obtain coNP-completeness of membership decision for $\mathcal{UC}_k$ for $k \geq 1$. In Subsection 7.2 we determine the relations to the previous hierarchies $\mathcal{SLUR}*(k)$ and $\mathrm{CANON}(k)$ as discussed in Subsection 3.2.



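Definition 7.1 Consider $k \in \mathbb{N}_0$. For clause-sets $F, F' \in \mathcal{CLS}$ the relation $F \xrightarrow{\mathrm{SLUR}:k} F'$ holds if there is $x \in \mathrm{lit}(F)$ such that $F' = r_k(\langle x \to 1 \rangle * F)$ and $F' \neq \{\bot\}$. The transitive-reflexive closure is denoted by $F \xrightarrow{\mathrm{SLUR}:k}_* F'$. The set of all fully reduced clause-sets reachable from $F$ is denoted by

$$\mathrm{slur}_k(F) := \{F' \in \mathcal{CLS} \mid F \xrightarrow{\mathrm{SLUR}:k}_* F' \wedge \neg \exists\, F'' \in \mathcal{CLS} : F' \xrightarrow{\mathrm{SLUR}:k} F''\}.$$

Finally the class of all clause-sets which are either identified by $r_k$ to be unsatisfiable, or where by $k$-SLUR-reduction always a satisfying assignment is found, is denoted by $\mathcal{SLUR}_k := \{F \in \mathcal{CLS} : r_k(F) \neq \{\bot\} \Rightarrow \mathrm{slur}_k(F) = \{\top\}\}$.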



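We have $\mathcal{SLUR}_1 = \mathcal{SLUR}$ (recall Definition 3.3). Note also the following simple properties for $F \in \mathcal{CLS}$:

1. $\top \in \mathrm{slur}_k(F) \Leftrightarrow F \in \mathcal{SAT}$.

2. For $F' \in \mathrm{slur}_k(F) \setminus \{\top\}$ we have $F' \in \mathcal{USAT}$, and if $F \in \mathcal{SAT}$, then $r_k(F') \neq \{\bot\}$.

3. If $F \in \mathcal{SLUR}_k$, then $F \in \mathcal{SAT}$ and $F \xrightarrow{\mathrm{SLUR}:k}_* F'$ implies $F' \in \mathcal{SAT}$.

Again we could define the transition relation in a less restricted way, as $F \xrightarrow{\mathrm{SLUR}:k} \langle x \to 1 \rangle * F$ iff $r_k(\langle x \to 1 \rangle * F) \neq \{\bot\}$, and this would yield the same class $\mathcal{SLUR}_k$.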

Example 7.2 Some examples for $\mathcal{SLUR}_2 \setminus \mathcal{SLUR}_1$:

1. Consider the unsatisfiable clause-set $F := \{\{x, y\}, \{\overline{x}, y\}, \{x, \overline{y}\}, \{\overline{x}, \overline{y}\}\}$.

(a) $F \notin \mathcal{SLUR}_1$ because $F$ is unsatisfiable but $r_1(F) \neq \{\bot\}$.

(b) $F \in \mathcal{SLUR}_2$ because $r_2(F) = \{\bot\}$.

2. Consider the satisfiable clause-set $F' := \{\{x_1, x_2\} \cup C \mid C \in F\}$.

(a) $F' \notin \mathcal{SLUR}_1 = \mathcal{SLUR}$ because $F' \xrightarrow{\mathrm{SLUR}}_* F = \langle x_1, x_2 \to 0 \rangle * F'$, where $\mathrm{slur}(F) = \{F\}$ and so $F \in \mathrm{slur}(F')$.

(b) $F' \in \mathcal{SLUR}_2$ because for any $\varphi$ such that $F' \xrightarrow{\mathrm{SLUR}:2}_* \varphi * F'$ and $\varphi * F' \neq \top$ we have one of the following two cases:

i. $\varphi * F'$ is satisfiable, and so $\varphi * F' \notin \mathrm{slur}_2(F')$.

ii. $\varphi * F'$ is unsatisfiable and so $\langle x_1 \to 0, x_2 \to 0 \rangle \subseteq \varphi$, but this contradicts the fact that $F' \xrightarrow{\mathrm{SLUR}:2}_* \varphi * F'$. That is, after setting either $x_1$ or $x_2$ to 0, lookahead with $r_2$ detects unsatisfiability of $\varphi * F'$ and so one can never transition to $\varphi * F'$ from $F'$.

Therefore $\mathrm{slur}_2(F') = \{\top\}$.

More generally we have $\{\{x_1, \ldots, x_k\} \cup C \mid C \in F\} \in \mathcal{SLUR}_2 \setminus \mathcal{SLUR}*(k)$ (recall Example ??).
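Both parts of Example 7.2 can be checked mechanically by exploring all SLUR transitions. The sketch below is an illustration only, under assumptions not taken from the paper: literals are nonzero integers, the variables $x, y, x_1, x_2$ are encoded as 1, 2, 3, 4, `r` implements generalised unit-clause propagation by its recursive definition, and the helper names (`assign`, `literals`, `slur`, `in_slur`) are hypothetical.

```python
BOT = frozenset({frozenset()})  # {⊥}
TOP = frozenset()               # ⊤, the empty clause-set

def assign(F, lit):
    """<lit -> 1> * F."""
    return frozenset(C - {-lit} for C in F if lit not in C)

def literals(F):
    """lit(F): all literals over the variables of F."""
    return sorted({s * abs(y) for C in F for y in C for s in (1, -1)})

def r(k, F):
    """Generalised unit-clause propagation r_k."""
    if frozenset() in F:
        return BOT
    if k == 0:
        return F
    for x in literals(F):
        if r(k - 1, assign(F, -x)) == BOT:
            return r(k, assign(F, x))
    return F

def slur(k, F):
    """slur_k(F): fully reduced clause-sets reachable via SLUR:k transitions."""
    seen, todo, final = {F}, [F], set()
    while todo:
        G = todo.pop()
        successors = set()
        for x in literals(G):
            H = r(k, assign(G, x))
            if H != BOT:                 # transitions into {⊥} are not allowed
                successors.add(H)
        if not successors:
            final.add(G)                 # G is fully reduced
        for H in successors - seen:
            seen.add(H)
            todo.append(H)
    return final

def in_slur(k, F):
    """Membership in SLUR_k according to Definition 7.1."""
    return r(k, F) == BOT or slur(k, F) == {TOP}

F = frozenset(map(frozenset, [[1, 2], [-1, 2], [1, -2], [-1, -2]]))
F_prime = frozenset(C | {3, 4} for C in F)
print(in_slur(1, F), in_slur(2, F))              # False True
print(in_slur(1, F_prime), in_slur(2, F_prime))  # False True
```

The exhaustive exploration is exponential in general; it is meant for checking small examples, not as a membership algorithm.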



Lemma 7.3 We have for $F \in \mathcal{CLS}$, $k \in \mathbb{N}_0$ and a partial assignment $\varphi$ with $r_k(\varphi * F) \neq \{\bot\}$ that $F \xrightarrow{\mathrm{SLUR}:k}_* r_k(\varphi * F)$ holds.

Proof: The assignments of $\varphi$ can be performed via SLUR-$k$-transitions. □



7.1 SLUR = UC 

For $F \in \mathcal{UC}_k$ there is the following polynomial-time SAT decision: $F$ is unsatisfiable iff $r_k(F) = \{\bot\}$. And a satisfying assignment can be found for satisfiable $F$ via self-reduction, that is, probing variables, where unsatisfiability again is checked for by means of $r_k$. For $k = 1$ this means exactly that the nondeterministic "SLUR"-algorithm will not fail. And that implies that $F \in \mathcal{SLUR}$ holds, where $\mathcal{SLUR}$ is the class of clause-sets where that algorithm never fails. So $\mathcal{UC}_1 \subseteq \mathcal{SLUR}$. Now it turns out that actually this property characterises $\mathcal{UC}_1$, that is, $\mathcal{UC}_1 = \mathcal{SLUR}$ holds, which makes available the results on $\mathcal{SLUR}$.

We now show that this equality between $\mathcal{UC}$ and $\mathcal{SLUR}$ holds in full generality for the $\mathcal{UC}_k$ and $\mathcal{SLUR}_k$ hierarchies.
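The SAT decision by self-reduction described above can be sketched as follows. This is a minimal illustration, assuming the input really lies in $\mathcal{UC}_k$ (the sketch does not check this), with literals encoded as nonzero integers and hypothetical helper names `assign`, `r` and `solve`.

```python
BOT = frozenset({frozenset()})  # {⊥}

def assign(F, lit):
    """<lit -> 1> * F."""
    return frozenset(C - {-lit} for C in F if lit not in C)

def r(k, F):
    """Generalised unit-clause propagation r_k."""
    if frozenset() in F:
        return BOT
    if k == 0:
        return F
    for x in sorted({s * abs(y) for C in F for y in C for s in (1, -1)}):
        if r(k - 1, assign(F, -x)) == BOT:
            return r(k, assign(F, x))
    return F

def solve(k, clauses):
    """SAT decision for a clause-set assumed to be in UC_k.

    Returns None if r_k refutes the input, otherwise a set of literals
    that satisfies every clause.
    """
    F = frozenset(frozenset(C) for C in clauses)
    if r(k, F) == BOT:
        return None
    model = set()
    while F:
        x = next(iter(next(iter(F))))   # probe some literal occurring in F
        if r(k, assign(F, x)) == BOT:   # x -> 1 is refuted by r_k,
            x = -x                      # so for F in UC_k the other branch is safe
        model.add(x)
        F = assign(F, x)
    return model

horn = [[1], [-1, 2], [-2, 3]]          # a Horn clause-set, hence in UC_1
print(solve(1, horn))
print(solve(1, [[1], [-1]]))            # None
```

For inputs outside $\mathcal{UC}_k$ the probing step can get stuck on an unsatisfiable clause-set that $r_k$ does not refute, which is exactly the failure case of the SLUR algorithm.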



Theorem 7.4 For all $k \in \mathbb{N}_0$ holds $\mathcal{SLUR}_k = \mathcal{UC}_k$.

Proof: Consider $F \in \mathcal{CLS}$. We have to show $F \in \mathcal{SLUR}_k \Leftrightarrow \mathrm{hd}(F) \leq k$. For $F \in \mathcal{USAT}$ this follows from the definitions, and thus we assume $F \in \mathcal{SAT}$.

First consider $F \in \mathcal{SLUR}_k$. Consider a partial assignment $\varphi$ such that $\varphi * F \in \mathcal{USAT}$. We have to show $r_k(\varphi * F) = \{\bot\}$, and so assume $r_k(\varphi * F) \neq \{\bot\}$. It follows $F \xrightarrow{\mathrm{SLUR}:k}_* r_k(\varphi * F)$ by Lemma 7.3, whence $r_k(\varphi * F) \in \mathcal{SAT}$, contradicting $\varphi * F \in \mathcal{USAT}$.

Now assume $\mathrm{hd}(F) \leq k$, and we show $F \in \mathcal{SLUR}_k$, i.e., $\mathrm{slur}_k(F) = \{\top\}$. Assume there is $F' \in \mathrm{slur}_k(F) \setminus \{\top\}$. By Property 2 for Definition 7.1 we get $F' \in \mathcal{USAT}$ and $r_k(F') \neq \{\bot\}$. However by Lemma ??, Part ?? we get $\mathrm{hd}(F') \leq k$, and thus $r_k(F') = \{\bot\}$. □
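As a sanity check, Theorem 7.4 can be confirmed by brute force on all clause-sets over two variables. The sketch below is illustrative only and rests on assumptions not taken from the paper: literals are nonzero integers, hardness is computed directly from its definition (the maximum over all unsatisfiable instantiations $\varphi * F$ of the least $k$ with $r_k(\varphi * F) = \{\bot\}$), and all function names are hypothetical.

```python
from itertools import combinations, product

BOT = frozenset({frozenset()})  # {⊥}
TOP = frozenset()               # ⊤

def assign(F, lit):
    return frozenset(C - {-lit} for C in F if lit not in C)

def literals(F):
    return sorted({s * abs(y) for C in F for y in C for s in (1, -1)})

def r(k, F):
    if frozenset() in F:
        return BOT
    if k == 0:
        return F
    for x in literals(F):
        if r(k - 1, assign(F, -x)) == BOT:
            return r(k, assign(F, x))
    return F

def satisfiable(F):
    if frozenset() in F:
        return False
    return not F or any(satisfiable(assign(F, s * literals(F)[-1])) for s in (1, -1))

def hardness(F):
    """hd(F) by definition: worst unsatisfiable instantiation phi * F."""
    vs = sorted({abs(x) for C in F for x in C})
    hd = 0
    for vals in product((None, 1, -1), repeat=len(vs)):
        G = F
        for v, s in zip(vs, vals):
            if s is not None:
                G = assign(G, s * v)
        if not satisfiable(G):
            k = 0
            while r(k, G) != BOT:
                k += 1
            hd = max(hd, k)
    return hd

def in_slur(k, F):
    """F in SLUR_k, by exploring all SLUR:k transitions (Definition 7.1)."""
    if r(k, F) == BOT:
        return True
    seen, todo = {F}, [F]
    while todo:
        G = todo.pop()
        succ = {H for x in literals(G) for H in [r(k, assign(G, x))] if H != BOT}
        if not succ and G != TOP:
            return False            # a fully reduced clause-set other than ⊤
        todo.extend(succ - seen)
        seen |= succ
    return True

def clause_sets(n):
    """All clause-sets over variables 1..n (without tautological clauses)."""
    clauses = [frozenset(s * v for v, s in enumerate(signs, 1) if s)
               for signs in product((0, 1, -1), repeat=n)]
    for m in range(len(clauses) + 1):
        for sub in combinations(clauses, m):
            yield frozenset(sub)

ok = all(in_slur(k, F) == (hardness(F) <= k)
         for F in clause_sets(2) for k in range(4))
print(ok)
```

With two variables there are 9 clauses and hence 512 clause-sets, so the check runs quickly; the same loop over three variables is already out of reach for exhaustive enumeration.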
It seemed an essential feature of the class $\mathcal{SLUR}$, that its most natural definition is by the SLUR-algorithm; for example in Franco and Schlipf we find the quote "I find it interesting that the algorithm seems simpler than the conditions under which it is a decision procedure." By Theorem 7.4 now we have a simple characterisation of these conditions, namely that unsatisfiability after instantiation is always detected by unit-clause propagation. Using the characterisation $\mathcal{SLUR} = \mathcal{UC}$, we can show coNP-completeness of hardness-determination:



Theorem 7.5 For fixed $k \in \mathbb{N}$ the decision whether $\mathrm{hd}(F) \leq k$ (i.e., whether $F \in \mathcal{UC}_k$, or, by Theorem 7.4, whether $F \in \mathcal{SLUR}_k$) is coNP-complete.

Proof: The decision whether $F \notin \mathcal{SLUR}_k$ is in NP by definition of $\mathcal{SLUR}_k$ (or use Lemma 5.4). By Theorem 3 in Čepek et al. [12] we have that $\mathcal{SLUR}$ is coNP-complete, which by Lemma 6.3 can be lifted to higher $k$. □



7.2 Comparison to the previous hierarchies 



The alternative hierarchies $\mathcal{SLUR}*(k)$ and $\mathrm{CANON}(k)$ (recall Subsection 3.2) do not generalise $r_1$ by $r_k$, but extend $r_1$ in various ways (maintaining linear-time computation for the (non-deterministic) transitions). In this way in Čepek et al. [12], Balyo et al. rather complicated argumentations arise, in contrast to our elegant characterisation of the classes $\mathcal{UC}_k$ in Theorem 5.7. As a consequence, we can give short proofs that the alternative hierarchies are subsumed by our hierarchy, while already the second level of our hierarchy is (naturally) not contained in any levels of these two hierarchies (naturally, since the time-exponent for deciding whether a (non-deterministic) transition can be done w.r.t. hierarchy $\mathcal{SLUR}_k$ depends on $k$).

First we simplify and generalise the main result of Balyo et al. that $\mathrm{CANON}(1) \subseteq \mathcal{SLUR}$. By definition we have $\mathrm{CANON}(0) = \mathcal{UC}_0$.



Theorem 7.6 For all $k \in \mathbb{N}_0$ we have:

1. $\mathrm{CANON}(k) \subseteq \mathcal{UC}_k$.

2. $\mathcal{UC}_1 \not\subseteq \mathrm{CANON}(k)$ (and thus $\mathrm{CANON}(k) \subset \mathcal{UC}_k$ for $k \geq 1$).

Proof: By Theorem ?? and the fact that the Horton-Strahler number of a tree is at most the height, we see that $\mathrm{CANON}(k) \subseteq \mathcal{UC}_k$. That $\mathcal{UC}_1 \not\subseteq \mathrm{CANON}(k)$ can be seen by observing that there are formulas in $\mathcal{HO} \cap \mathcal{USAT}$ with arbitrary resolution-height complexity and so $\mathcal{HO} \not\subseteq \mathrm{CANON}(k)$. By $\mathcal{HO} \subset \mathcal{UC}_1$ we get $\mathcal{UC}_1 \not\subseteq \mathrm{CANON}(k)$. □

Also the other hierarchy $\mathcal{SLUR}*(k)$ is strictly contained in our hierarchy:

Theorem 7.7 For all $k \in \mathbb{N}_0$ we have:

1. $\mathcal{SLUR}*(k) \subset \mathcal{SLUR}_{k+1}$.

2. $\mathcal{SLUR}_2 \not\subseteq \mathcal{SLUR}*(k)$.

Proof: Part 1 follows most easily by using Lemma 6.4 together with the simple fact that $\mathrm{slur}*(k)(F) = \{F\}$ for $F \neq \top$ implies $r_{k+1}(F) = \{\bot\}$; for the strictness of the inclusion use Part 2. Part 2 follows from $\mathrm{CANON}(2) \not\subseteq \mathcal{SLUR}*(k)$ (Lemma 13 in Balyo et al.), while by Theorem 7.6 we have $\mathrm{CANON}(2) \subseteq \mathcal{SLUR}_2$. □

Part 1 of Theorem 7.7 can not be improved, since $\mathcal{SLUR}*(k)$ and $\mathcal{SLUR}_k$ are incomparable:

Lemma 7.8 For $k \geq 2$ holds $\mathcal{SLUR}*(k) \not\subseteq \mathcal{SLUR}_k$ and $\mathcal{SLUR}_k \not\subseteq \mathcal{SLUR}*(k)$.

Proof: That $\mathcal{SLUR}_k \not\subseteq \mathcal{SLUR}*(k)$ follows by Part 2 of Theorem 7.7. That $\mathcal{SLUR}*(k) \not\subseteq \mathcal{SLUR}_k$ follows from the fact that for the full unsatisfiable clause-set $F_k$ on $k$ variables (i.e., containing all $2^k$ clauses of length $k$) we have $F_{k+1} \in \mathcal{SLUR}*(k)$ by Lemma 10 in Balyo et al. but $F_{k+1} \notin \mathcal{SLUR}_k$ by Part ?? of Lemma 6.2. □



8 Optimisation 

We conclude by considering the question of finding, for an input-clause-set $F$, short equivalent clause-sets $F' \in \mathcal{UC}_k$ for fixed $k$. Definition 8.1 provides the appropriate notion of "irredundancy" via the notion of a "$k$-base", where irredundancy refers to both removal of literal occurrences and removal of clauses.

In Theorem 8.3 we show that the problem is solvable in polynomial time for inputs $F \in 2\text{-}\mathcal{CLS}$, while in Theorem 8.4 we show that the problem is NP-complete even when restricting the input to Horn clause-sets with very few prime implicates.



Definition 8.1 A clause-set $F$ is a $k$-base for some $k \in \mathbb{N}_0 \cup \{+\infty\}$ if $\mathrm{hd}(F) \leq k$, and after removing any literal occurrence or any clause from $F$, the result $F'$ is either not equivalent to $F$ or has $\mathrm{hd}(F') > k$.

Remarks: 

1. Every $k$-base $F$ is primal, that is, $F \subseteq \mathrm{prc}_0(F)$.

2. A clause-set $F$ is a 0-base iff $F = \mathrm{prc}_0(F)$, while $F$ is an $\infty$-base iff $F$ is primal and irredundant (removal of any clause yields a clause-set not equivalent to $F$).

3. For a given clause-set $F$, we consider the problem of computing a shortest (w.r.t. the number of clauses or the number of literal occurrences) equivalent $k$-base $F'$, which we call a $k$-base for $F$:

(a) By Schaefer and Umans for $k = \infty$ this problem is $\Sigma_2$-complete.

(b) A special case of interest here is when $F = \mathrm{prc}_0(F)$, in which case $F' \subseteq F$ must hold. Since all prime implicates are given as input, for $k < \infty$ the decision problem whether $F$ has a $k$-base of size at most $m$ ($m$ is part of the input) is now in NP. In Theorem 8.4 we will see that this decision problem is actually NP-complete, even under rather restricted circumstances.

Example 8.2 Consider the clause-set 

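$$F := \{\underbrace{\{v_1, \overline{v_3}, \overline{v_4}\}}_{C_1}, \underbrace{\{v_2, \overline{v_3}, v_4\}}_{C_2}, \underbrace{\{v_2, v_3, \overline{v_4}\}}_{C_3}, \underbrace{\{\overline{v_2}, v_3, v_4\}}_{C_4}, \underbrace{\{v_1, v_3, v_4\}}_{C_5}, \underbrace{\{v_1, v_2\}}_{C_6}\}$$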

and clause-sets $F_1 := F \setminus \{C_5\}$ and $F_2 := F \setminus \{C_6\}$. We have that:

1. $F$ is a 0-base, that is, $\mathrm{prc}_0(F) = F$.

We have to show that $F$ is closed under resolution modulo subsumption. We have the following possible resolutions in $F$ with the associated subsuming clauses: $C_1 \diamond C_2 \supseteq C_6$, $C_1 \diamond C_3 \supseteq C_6$, $C_2 \diamond C_5 \supseteq C_6$, $C_3 \diamond C_5 \supseteq C_6$, $C_4 \diamond C_6 = C_5$.

2. $F$, $F_1$ and $F_2$ are the only $k$-bases ($k \in \mathbb{N}_0$) that are equivalent to $F$.

To show that there are no other $k$-bases equivalent to $F$ we must show that all other subsets of $F$ are not equivalent to $F$. It suffices to show that the clauses $C_1, C_2, C_3, C_4$ are irredundant (i.e., occur in all primal clause-sets equivalent to $F$) and the clause-set $F_3 := F \setminus \{C_5, C_6\}$ is not equivalent to $F$. The irredundancy of $C_1, C_2, C_3, C_4$ is seen by the fact that they are not obtained as resolvents. That $F_3$ is not equivalent to $F$ follows from the fact that $F_3$ does not contain positive clauses while $F$ does.

3. $F_1$ is a 1-base (and 2-base) and is equivalent to $F$ but is not a 0-base.

We have $C_4 \diamond C_6 = C_5$ and thus $F_1 \models C_5$. To see $\mathrm{hd}(F_1) = 1$, observe $\mathrm{hd}(\varphi_{C_5} * F_1) = \mathrm{hd}(\{\{v_2\}, \{\overline{v_2}\}\}) = 1$.

4. $F_2$ is a 2-base and is equivalent to $F$ but is not a 1-base.

We have $(C_1 \diamond C_3) \diamond (C_2 \diamond C_5) = C_6$ and thus $F_2 \models C_6$. Furthermore $\mathrm{hd}(\varphi_{C_6} * F_2) = \mathrm{hd}(\{\{\overline{v_3}, \overline{v_4}\}, \{\overline{v_3}, v_4\}, \{v_3, \overline{v_4}\}, \{v_3, v_4\}\}) = 2$.

5. Thus $F$ is neither a 1-base nor a 2-base.
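Definition 8.1 can be tested directly on small inputs. The sketch below is an illustration under explicit assumptions: literals are nonzero integers, hardness and equivalence are computed by brute force, and the signs of the literals in $C_1, \ldots, C_6$ are a reconstruction chosen to be consistent with the resolution facts stated in Example 8.2 (the original signs may differ up to renaming of variables). All function names are hypothetical.

```python
from itertools import product

BOT = frozenset({frozenset()})  # {⊥}

def assign(F, lit):
    return frozenset(C - {-lit} for C in F if lit not in C)

def variables(F):
    return sorted({abs(x) for C in F for x in C})

def r(k, F):
    if frozenset() in F:
        return BOT
    if k == 0:
        return F
    for x in [s * v for v in variables(F) for s in (1, -1)]:
        if r(k - 1, assign(F, -x)) == BOT:
            return r(k, assign(F, x))
    return F

def satisfiable(F):
    if frozenset() in F:
        return False
    return not F or any(satisfiable(assign(F, s * variables(F)[0])) for s in (1, -1))

def hardness(F):
    """hd(F) by definition, over all partial assignments."""
    vs, hd = variables(F), 0
    for vals in product((None, 1, -1), repeat=len(vs)):
        G = F
        for v, s in zip(vs, vals):
            if s is not None:
                G = assign(G, s * v)
        if not satisfiable(G):
            k = 0
            while r(k, G) != BOT:
                k += 1
            hd = max(hd, k)
    return hd

def equivalent(F, G):
    """Same satisfying total assignments over the union of the variables."""
    vs = variables(F | G)
    for signs in product((1, -1), repeat=len(vs)):
        true_lits = {s * v for v, s in zip(vs, signs)}
        if all(C & true_lits for C in F) != all(C & true_lits for C in G):
            return False
    return True

def is_k_base(F, k):
    """Definition 8.1; k may also be float('inf')."""
    if hardness(F) > k:
        return False
    variants = [F - {C} for C in F]                            # remove a clause
    variants += [(F - {C}) | {C - {x}} for C in F for x in C]  # remove a literal occurrence
    return all(not equivalent(F, G) or hardness(G) > k for G in variants)

# Reconstructed clauses C1..C6 of Example 8.2 (signs are an assumption).
C = [frozenset(c) for c in
     ([1, -3, -4], [2, -3, 4], [2, 3, -4], [-2, 3, 4], [1, 3, 4], [1, 2])]
F = frozenset(C)
F1, F2 = F - {C[4]}, F - {C[5]}
print(is_k_base(F, 0), is_k_base(F1, 1), is_k_base(F2, 2))  # True True True
print(is_k_base(F, 1), is_k_base(F2, 1))                    # False False
```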

Theorem 8.3 For clause-sets $F \in 2\text{-}\mathcal{CLS}$ we can compute shortest-size (minimum number of clauses or minimum number of literal occurrences) equivalent $k$-bases $F'$ for all $k \in \mathbb{N}_0 \cup \{+\infty\}$ in polynomial time as follows:

1. If $F$ is unsatisfiable, then the best possibility is $F' := \{\bot\}$. So assume in the sequel that $F$ is satisfiable.

2. If $F = \top$, then $F' := \top$. So assume in the sequel that $F \neq \top$.

3. If $F$ has a forced literal $x$, then any $k$-base for $F$ contains $\{x\}$, and we can split off $x$ by considering an optimal $k$-base for $\langle x \to 1 \rangle * F$. So we can assume w.l.o.g. in the sequel that $F$ has no forced literals. (Thus $F$ as well as $\mathrm{prc}_0(F)$ contains only clauses of length equal 2.)

4. Since all $k$-bases of $F$ without new variables are subsets of $\mathrm{prc}_0(F)$, when considering "shortest $k$-bases" now there is no difference between the measures $c$ (number of clauses) and $\ell$ (number of literal occurrences), and we can just speak of "shortest $k$-bases".

5. The (unique) 0-base of $F$, the set $\mathrm{prc}_0(F) \in 2\text{-}\mathcal{CLS}$ of all prime implicates, can be computed in polynomial time by the methods discussed in Section 5.8 in Crama and Hammer.

6. Every $\infty$-base of $F$ without new variables is a 1-base (Lemma ??), and thus w.r.t. $k$-bases for $k \in \mathbb{N}_0 \cup \{+\infty\}$ only the determination of shortest 1-bases is left, where the shortest 1-bases are precisely the smallest subsets of $\mathrm{prc}_0(F)$ equivalent to $F$.

7. Finally in Chapter 9 of Chang [13] (affirmed in Hemaspaandra and Schnoor) it is shown how to compute shortest equivalent sets of prime implicates, and thus shortest 1-bases can be computed in polynomial time.

Theorem 8.4 Consider $k \in \mathbb{N}_0 \cup \{+\infty\}$.

1. Assume $k \geq 1$. The decision problem "For inputs $F \in \mathcal{HO}^+ \cap 3\text{-}\mathcal{CLS}$ with $\mathrm{prc}_0(F) = F$ and $m \in \mathbb{N}_0$, decide whether there is a $k$-base $F'$ of $F$ with $c(F') \leq m$." (note that here $F' \subseteq F$ must hold) is NP-complete.

2. For $k = 0$ the decision problem "For input $F \in \mathcal{HO}$ and $m \in \mathbb{N}_0$, decide whether there is a $k$-base $F'$ of $F$ with $c(F') \leq m$." is in P.

Proof: For Part 2 one enumerates with polynomial delay the prime implicates of $F$ (see Section 6.5 in Crama and Hammer for efficient methods): if this process stops with at most $m$ prime implicates found, then the answer is "yes", otherwise the answer is "no".

For Part 1 we first note that the problem is in NP, since all prime clauses are given, and $\mathrm{hd}(F) \leq 1$. The heart of the completeness is Theorem 6.18 in Crama and Hammer, which states that "Horn minimisation w.r.t. the number of clauses remains NP-complete even if the input is restricted to cubic pure Horn expressions.", plus the fact from the underlying report Boros and Čepek that for the considered $G \in \mathcal{HO}^+ \cap 3\text{-}\mathcal{CLS}$ all prime implicates are also of length at most 3, and thus we can take as input $F := \mathrm{prc}_0(G) \in \mathcal{HO}^+ \cap 3\text{-}\mathcal{CLS}$ (which can be computed in polynomial time). □
9 Conclusion and outlook



We brought together two streams of research, one started by del Val in 1994, introducing $\mathcal{UC}$ for knowledge compilation, and one started by Schlipf et al. in 1995, introducing $\mathcal{SLUR}$ for polytime SAT decision. Two natural generalisations, $\mathcal{UC}_k$ and $\mathcal{SLUR}_k$, have been provided, and the (actually surprising) identity $\mathcal{SLUR}_k = \mathcal{UC}_k$ provides both sides of the equation with additional tools. Various basic lemmas have been shown, providing a framework for elegant and powerful proofs. Regarding computational problems, we solved the most basic questions.

Our main future application, which brings the $\mathcal{UC}$-perspective and the $\mathcal{SLUR}$-perspective together, is in the area of "good SAT representations"; see Subsection 9.2 for more information. We consider the approach of representing a boolean function $f$ via a clause-set $F \in \mathcal{UC}_k$ as the first beginning of what we envisage as a theory of good SAT representations.

We outline now what seems to us the most promising directions for future 
investigations (and where we already have partial results). 



9.1 Propagation-hardness 

Complementary to "unit-refutation completeness" there is the notion of "propagation completeness", as investigated in Darwiche and Pipatsrisawat, Bordeaux and Marques-Silva. This will be captured and generalised by a corresponding measure $\mathrm{phd} : \mathcal{CLS} \to \mathbb{N}_0$ of "propagation-hardness", defined as follows:



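Definition 9.1 For $F \in \mathcal{CLS}$ we define the propagation-hardness (for short "p-hardness") $\mathrm{phd}(F) \in \mathbb{N}_0$ as the minimal $k \in \mathbb{N}_0$ such that for all partial assignments $\varphi \in \mathcal{PASS}$ we have $r_k(\varphi * F) = r_\infty(\varphi * F)$, where $r_\infty$ applies all forced assignments.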

Now the class $\mathcal{PC}$ of "propagation-complete clause-sets" can be properly generalised:

Definition 9.2 For $k \in \mathbb{N}_0$ let $\mathcal{PC}_k := \{F \in \mathcal{CLS} : \mathrm{phd}(F) \leq k\}$ (the class of propagation-complete clause-sets of level $k$).

We have $\mathcal{PC} = \mathcal{PC}_1$. These classes lie (strictly) between the $\mathcal{UC}_k$-classes:

Lemma 9.3 For $k \in \mathbb{N}_0$ we have $\mathcal{PC}_k \subset \mathcal{UC}_k \subset \mathcal{PC}_{k+1}$.



9.2 Good representations of boolean functions 

The real power of SAT representations comes with new variables. Expressive power and limitations of "good representations" have to be studied. In the SAT-context the most useful notion of "representation" of a boolean function $f$ seems to be $\Sigma_1$-QCNF-representations, that is, clause-sets $F$ with $\mathrm{var}(f) \subseteq \mathrm{var}(F)$, where the new variables (in $\mathrm{var}(F) \setminus \mathrm{var}(f)$) are implicitly existentially quantified; in other words, the satisfying assignments of $F$ projected to the variables of $f$ are precisely the satisfying assignments of $f$; see Bubeck and Büning for some general results. The restricted representations we already considered in Subsection 1.4 are those without new variables, that is, where $\mathrm{var}(F) = \mathrm{var}(f)$.

Additional conditions on F are needed to get "effective" representations, 
since in general the evaluation of F for a total assignment for / is an NP- 
problem. Strong representations are those with bounded hardness. Strength- 
ening Conjecture 1.1 from the introduction, we conjecture that also with new 
variables the power of representing boolean functions increases when allowing 
higher hardness: 



Conjecture 9.4 For every $k \in \mathbb{N}_0$ the set of sequences $(f_n)_{n \in \mathbb{N}}$ of boolean functions having sequences $(F_n)_{n \in \mathbb{N}}$ of polysize-representations of p-hardness at most $k$ (i.e., $\mathrm{phd}(F_n) \leq k$ for all $n$) is strictly smaller than those having polysize-representations of hardness at most $k$ (i.e., $\mathrm{hd}(F_n) \leq k$ for all $n$), which in turn is strictly smaller than those having polysize-representations of p-hardness at most $k+1$ (i.e., $\mathrm{phd}(F_n) \leq k+1$ for all $n$).



We wish to remind the reader of the open problem mentioned in Subsection 1.5 
about the existence of a polysize-representation of bounded hardness for affine 
boolean functions. 

We need to emphasise here that representations $F$ of boolean functions $f$ with $\mathrm{hd}(F) \leq k$ fulfil an absolute condition, that is, we can determine unsatisfiability by $r_k$ for arbitrary partial assignments, not just those using only the variables of $f$. When only asking for this relative condition (currently the standard, posing conditions only on variables occurring in the represented boolean function $f$, ignoring the new variables of $F$), then by generalising Bessière et al. we can show that the hierarchies collapse to the first level. This is due to the "uncontrolled" use of the new variables (the relative condition doesn't pose conditions on them). See Bordeaux, Janota, Marques-Silva, and Marquis for a study on $\mathcal{UC}$ together with the relative condition.



9.3 Applications to cryptanalysis 

As an application of the theory of "good representations" we consider cryptanalytic problems, especially attacking AES/DES, as preliminarily discussed in Gwynne and Kullmann. For the experimental evaluation we consider the various boolean functions ("constraints") used by these ciphers, most prominently the "S-boxes", and systematically search for short representations of hardness 0, 1, 2 and p-hardness 1, 2. Various solvers are then run on the SAT-problems obtained by plaintext/ciphertext pairs (where the task is to determine the key). The strengthened inference power seems especially interesting for the combination of look-ahead ("tree-resolution based") and conflict-driven ("dag-resolution based") SAT solvers as introduced in Heule, Kullmann, Wieringa, and Biere.



9.4 Relativised hardness 

Generalising Bessière et al. we can show that for example the satisfiable pigeonhole formulas $\mathrm{PHP}^m_m$ do not have polysize representations of bounded hardness even for the relative condition. One way to overcome this barrier is to generalise the theory started here via the use of oracles as in [36, 37] (recall Subsection 6.3), and then employing oracles which can handle pigeonhole formulas. The basic definitions are as follows.

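Definition 9.5 A valid oracle for generalised unit-clause propagation is some $\mathcal{U} \subseteq \mathcal{USAT}$ with $\{\bot\} \in \mathcal{U}$ which is stable under application of partial assignments. The oracle is strong if $\mathcal{U}_0 \subseteq \mathcal{U}$, where $\mathcal{U}_0 := \{F \in \mathcal{CLS} : \bot \in F\}$.

Consider $k \in \mathbb{N}_0$. In [36, 37] the reduction $r_k^{\mathcal{U}} : \mathcal{CLS} \to \mathcal{CLS}$ has been defined. An equivalent definition (generalising Definition ??) is as follows for $F \in \mathcal{CLS}$:

$$r_0^{\mathcal{U}}(F) := \begin{cases} \{\bot\} & \text{if } F \in \mathcal{U} \\ F & \text{otherwise} \end{cases}$$

$$r_{k+1}^{\mathcal{U}}(F) := \begin{cases} r_{k+1}^{\mathcal{U}}(\langle x \to 1 \rangle * F) & \text{if } \exists\, x \in \mathrm{lit}(F) : r_k^{\mathcal{U}}(\langle x \to 0 \rangle * F) = \{\bot\} \\ F & \text{otherwise} \end{cases}$$

Note $r_k = r_k^{\mathcal{U}_0}$. Generalising Definitions ??: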



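Definition 9.6 Consider a valid oracle $\mathcal{U}$. The hardness $\mathrm{hd}_{\mathcal{U}}(F) \in \mathbb{N}_0$ ("hardness with oracle $\mathcal{U}$") of an unsatisfiable $F \in \mathcal{CLS}$ is the minimal $k \in \mathbb{N}_0$ such that $r_k^{\mathcal{U}}(F) = \{\bot\}$. And for general $F \in \mathcal{CLS}$ we define $\mathrm{hd}_{\mathcal{U}}(\top) := 0$, while for $F \neq \top$ let

$$\mathrm{hd}_{\mathcal{U}}(F) := \max\{\mathrm{hd}_{\mathcal{U}}(\varphi * F) : \varphi \in \mathcal{PASS} \wedge \varphi * F \in \mathcal{USAT}\} \in \mathbb{N}_0.$$

We have $\mathrm{hd} = \mathrm{hd}_{\mathcal{U}_0}$, and if $\mathcal{U}$ is strong then for all $F$ holds $\mathrm{hd}_{\mathcal{U}}(F) \leq \mathrm{hd}(F)$. An interesting oracle $\mathcal{U}$ (with polytime membership decision) is given by the class of unsatisfiable clause-sets defined in de Klerk, van Maaren, and Warners [19] via semidefinite programming, for which we get $\mathrm{hd}_{\mathcal{U}}(\mathrm{PHP}^m_m) = 0$.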

9.5 Width-based hardness 

The basic idea is to use width-restricted resolution instead of nested input resolution, in order to increase inference power from tree-resolution to dag-resolution. A basic weakness of the standard notion of width-restricted resolution, which demands that both parent clauses must have length at most $k$ for some fixed $k \in \mathbb{N}_0$ (the "width"), is that even Horn clause-sets require unbounded width in this sense. The correct solution, as investigated and discussed in [36, 37], is to use the notion of "$k$-resolution" as introduced in Kleine Büning [34], where only one parent clause needs to have length at most $k$ (thus properly generalising unit-resolution).
Definition 9.7 Consider $k \in \mathbb{N}_0$.

• Two resolvable clauses $C, D$ are $k$-resolvable if $|C| \leq k$ or $|D| \leq k$.

• We use $F \vdash^k C$ if there is a resolution proof $R$ of some $C' \subseteq C$ from $F$ such that all resolutions in $R$ are $k$-resolutions.

This allows us now to define "width-hardness" (accordingly the "hardness" only 
studied in this paper can be called "tree-hardness" ) : 

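Definition 9.8 For $F \in \mathcal{USAT}$ let $\mathrm{whd}(F) \in \mathbb{N}_0$ be the minimal $k \in \mathbb{N}_0$ such that $F \vdash^k \bot$ holds. And for $F \in \mathcal{CLS}$ let $\mathrm{whd}(F) \in \mathbb{N}_0$ be the minimal $k \in \mathbb{N}_0$ such that for all partial assignments $\varphi$ holds $\varphi * F \in \mathcal{USAT} \Rightarrow \varphi * F \vdash^k \bot$.

We have $\mathrm{whd}(F) = k \Leftrightarrow \mathrm{hd}(F) = k$ for $k \in \{0, 1\}$, while in general $\mathrm{whd}(F) \leq \mathrm{hd}(F)$ holds (for all $F \in \mathcal{CLS}$).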
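For unsatisfiable clause-sets, width-hardness can be computed on small inputs by saturating under $k$-resolution. The following sketch is an illustration only, with assumptions not taken from the paper: literals are nonzero integers, resolvable clauses clash in exactly one literal, the closure is computed naively (exponential in general), and the names `k_refutable` and `whd_unsat` are hypothetical.

```python
from itertools import combinations

def k_refutable(clauses, k):
    """True iff ⊥ is derivable from the clauses using only k-resolutions."""
    derived = {frozenset(C) for C in clauses}
    while frozenset() not in derived:
        new = set()
        for C, D in combinations(derived, 2):
            if len(C) > k and len(D) > k:
                continue                      # not k-resolvable (Definition 9.7)
            clash = [x for x in C if -x in D]
            if len(clash) == 1:
                new.add((C - {clash[0]}) | (D - {-clash[0]}))
        if new <= derived:
            return False                      # closure reached without ⊥
        derived |= new
    return True

def whd_unsat(clauses):
    """whd(F) for unsatisfiable F; raises ValueError if F is satisfiable."""
    n = len({abs(x) for C in clauses for x in C})
    for k in range(n + 1):
        if k_refutable(clauses, k):
            return k
    raise ValueError("clause-set is satisfiable")

print(whd_unsat([[1, 2], [-1, 2], [1, -2], [-1, -2]]))  # 2
print(whd_unsat([[1], [-1, 2], [-2]]))                  # 1: unit resolution suffices
```

For $k = 1$ the saturation is unit resolution, and for $k = 0$ it only checks whether $\bot$ is already present, matching the agreement of width-hardness and hardness on the levels 0 and 1.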

Conjecture 9.9 For every $k \in \mathbb{N}_0$ the set of families of boolean functions having polysize representations of width-hardness at most $k$ is strictly smaller than those having polysize-representations of width-hardness at most $k+1$. For $k \geq 1$ families showing the separation can be chosen such that they have unbounded hardness.



Finally we mention that, as in Subsection 9.4, we also have a relativised version $\mathrm{whd}_{\mathcal{U}}$, based on relativised $k$-resolution as studied in [36, 37].